
Batoto becoming registered only?


  • This topic is locked
512 replies to this topic

#181
Grumpy

Grumpy

    RawR

  • Administrators
  • 4,078 posts
  • LocationHere of course!

Can you instead make heavy limitations for casual readers? For instance: "3 chapters and 10 comic page views per hour", with a suggestion to register. Kinda like file hosters do.
Crawlers will find a way around registration, but lazy users probably just leave.

That's sort of what's being tried as of now. But the crawlers will simply mask themselves as another and repeat.

 

Indeed they will figure out a way around the registration. But as I've said, it's not to block them out completely. It's to put it in a state where it's manageable. Not every crawler will bother to perfect the art of crawling.



#182
Shirahago

Shirahago

    Potato Sprout

  • Members
  • 3 posts

Excuse my ignorance but what exactly is the reason to use an aggregator site instead of batoto? Even the less experienced reader should be aware that releases on these sites have a worse quality than here. Only a very minor amount of series that are available on other sites are not available here so why are they reading a chapter in mediocre quality when they can have exactly the same for free in better quality?

 

I'm on board with whatever Grumpy chooses to do, but a more long-term solution will mostly be in giving the people who are reading on other sites enough incentive to read on batoto instead, thus making it less and less lucrative for crawlers, hopefully by a magnitude large enough that they actually die out.


Edited by Shirahago, 21 October 2015 - 10:16 AM.


#183
leemeru

leemeru

    Potato Sprout

  • Members
  • 2 posts
  • Locationouterspace

As a staff member of a group that has been using batoto as a release outlet for our chapters (since ours has died due to the amount of requests and bandwidth issues from releases), I think something like what halo mentioned up top is a good idea. Allow each IP to read only 2-3 chapters total without having to sign up; this way groups that use batoto as a release outlet won't give the website as many new member leechers who only come to read those chapters.



#184
kanadaj

kanadaj

    Potato Sprout

  • Members
  • 9 posts

 

What makes you think the bots need to screen cap?

 

It's either text scraping or screen scraping; from Batoto's point of view, it's essentially the same thing. They download the full page content that other users see, forcing the templating engine to generate hundreds of thousands of pages in a very short amount of time. That, multiple times a day, for all the hundreds of aggregators.

 

The obvious solution is that if you can't stop the bots from doing it, you need to give them a way to do it with less impact on you or your site. An API can reduce the number of requests the robots make and get rid of the overhead of page templating.



#185
Guest_BAnon_von_Kartoffel

Guest_BAnon_von_Kartoffel
  • Guests

OFF Topic

 

I think I said in my original comment that it is the uploader who should have the right to decide on this. If they are content with delayed access for non-members, who cares? It is the scanlator's choice in this matter.

I think since you are a contributor you might know this: they just need to add a small feature for choosing whether or not to delay their scans.

I wasn't talking about the delay in general when I quoted you, but about the idea to only show portions of the uploaded chapters to non-members. It would bring nothing to the table. You either make everything accessible for everyone or delay some stuff. But baiting people into registering by showing them a part of a chapter would be disgusting. This isn't EBJ, we don't own shit, so why would you do that? It might seem like people would register right away to read their mangos, but I bet that most would feel pissed off and just use any other manga reader. Batoto would become a memesite until some new batoto 2.0 appears that doesn't have these flaws and at some point overtakes this site.



#186
Ayreos

Ayreos

    Potato Sprout

  • Members
  • 9 posts

After reading various ideas, registration + limiting how many pages each account can load in a given time seems the only way to me. Further, only one to three accounts per IP should be allowed to be logged in at the same time. The page load limit can be invisible to users, but stop most crawlers from wasting bandwidth. That's the level where coding workarounds becomes a hassle, in my eyes.
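The per-account page limit described above could look roughly like the following. This is only a minimal sketch: the class name `AccountPageLimiter`, its parameters, and the sliding-window approach are assumptions for illustration, not anything Batoto is known to run.

```python
import time
from collections import defaultdict, deque


class AccountPageLimiter:
    """Sliding-window cap on page loads per account (hypothetical helper)."""

    def __init__(self, max_pages, window_seconds, clock=time.monotonic):
        self.max_pages = max_pages
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._loads = defaultdict(deque)  # account id -> timestamps of recent loads

    def allow(self, account_id):
        """Return True and record the load if the account is under its cap."""
        now = self.clock()
        loads = self._loads[account_id]
        # Drop loads that have aged out of the window.
        while loads and now - loads[0] >= self.window_seconds:
            loads.popleft()
        if len(loads) >= self.max_pages:
            return False
        loads.append(now)
        return True
```

Because the count is keyed on the account rather than the IP, a crawler cannot reset it by switching addresses; it would have to register more accounts, which is where a cap on accounts per IP would come in.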



#187
Harshrox3

Harshrox3

    Potato Spud

  • Contributor
  • 20 posts
  • LocationIndia

I wasn't talking about the delay in general when I quoted you, but about the idea to only show portions of the uploaded chapters to non-members. It would bring nothing to the table. You either make everything accessible for everyone or delay some stuff. But baiting people into registering by showing them a part of a chapter would be disgusting. This isn't EBJ, we don't own shit, so why would you do that? It might seem like people would register right away to read their mangos, but I bet that most would feel pissed off and just use any other manga reader. Batoto would become a memesite until some new batoto 2.0 appears that doesn't have these flaws and at some point overtakes this site.

 ok, gotcha


The opinions expressed by this user are solely their own and do not express the views of Batoto and its staff.

#188
illya_

illya_

    Potato Sprout

  • Contributor
  • 4 posts

Well, after reading Grumpy's clarification I guess forced registrations are more or less inevitable.



#189
Volandum

Volandum

    Potato Spud

  • Donator
  • 18 posts

I've been pretty much doing that for 5 years now. Doesn't help that the number of aggregates in the world is on the rise and only so much can be cached.

 

 

As I suggested, what about simply providing the aggregators with an honest list of recent updates to public-accessible pages, and then just caching those?
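A cached update list of the kind suggested here might be sketched as below. The names `CachedUpdateFeed` and `build_feed` and the TTL value are hypothetical; how the real list would be built from the database is not described anywhere in this thread.

```python
import time


class CachedUpdateFeed:
    """Serve one cached copy of a recent-updates list (illustrative only)."""

    def __init__(self, build_feed, ttl_seconds, clock=time.monotonic):
        self.build_feed = build_feed  # expensive callable, e.g. a database query
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._cached = None
        self._expires_at = 0.0

    def get(self):
        """Rebuild the list at most once per TTL; otherwise return the cached copy."""
        now = self.clock()
        if self._cached is None or now >= self._expires_at:
            self._cached = self.build_feed()
            self._expires_at = now + self.ttl_seconds
        return self._cached
```

With this shape, the cost of building the list is paid once per TTL regardless of how many aggregators poll it. It only helps if the crawlers actually switch to the list instead of refreshing comic pages.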



#190
sneezemonkey

sneezemonkey

    Potato

  • Members
  • 122 posts

As a staff member of a group that has been using batoto as a release outlet for our chapters (since ours has died due to the amount of requests and bandwidth issues from releases), I think something like what halo mentioned up top is a good idea. Allow each IP to read only 2-3 chapters total without having to sign up; this way groups that use batoto as a release outlet won't give the website as many new member leechers who only come to read those chapters.

The problem is that these bots have a large pool of IPs, so this doesn't work. It's way more effective to tie the load rate to an account.

 

 

It's either text scraping or screen scraping; from Batoto's point of view, it's essentially the same thing. They download the full page content that other users see, forcing the templating engine to generate hundreds of thousands of pages in a very short amount of time. That, multiple times a day, for all the hundreds of aggregators.

 

The obvious solution is that if you can't stop the bots from doing it, you need to give them a way to do it with less impact on you or your site. An API can reduce the number of requests the robots make and get rid of the overhead of page templating.

Or you could figure out the pattern of the website layout, Ajax all the img URLs and auto-compress the images to save server space, which incidentally is what they are doing right now. They only really need to consider screen capping if the images are split into sections. And I also think that doing a semi-private model and limiting the older stuff to members is going to be a lot more effective at combating the deep crawlers (as has been previously stated), since you can tie the request rate to a particular account.


Edited by sneezemonkey, 21 October 2015 - 11:11 AM.

Tired of halved double page spreads? Want to read manga like an actual tankoubon? Just want to load all pages in a chapter at once?

Try Manga OnlineViewer Fluid Mode+ Now!!!!


#191
Halo

Halo

    Potato

  • Donator
  • 171 posts

That's sort of what's being tried as of now. But the crawlers will simply mask themselves as another and repeat.

Just now I was able to open 10 tabs as a guest with the "one page viewer" extension, almost instantly loading 100~ pages. 

To cover the entire catalog with the "3 chapters and 10 comic pages per hour" limitation, crawlers will need 1500~ IPs, around 56k IPs if they wish to load 170k English chapters. That's quite expensive... You might as well just provide them with an API and charge money.
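A rough check of the 56k figure, assuming it means loading all 170k chapters within a single hour at 3 chapters per IP. The post does not say what the 1500~ figure is based on, so it is not reproduced here; the function name `ips_needed` is just for illustration.

```python
import math


def ips_needed(chapters, chapters_per_ip_per_hour, hours):
    """IPs required to fetch `chapters` within `hours` under a per-IP hourly cap."""
    return math.ceil(chapters / (chapters_per_ip_per_hour * hours))


one_hour = ips_needed(170_000, 3, 1)   # everything within one hour
one_day = ips_needed(170_000, 3, 24)   # same catalog spread over a day
print(one_hour, one_day)  # 56667 2362
```

Spreading the crawl over a day cuts the number of IPs needed by a factor of 24, so the cost to a patient crawler is lower than the one-hour figure suggests.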


Edited by Halo, 21 October 2015 - 11:32 AM.


#192
Irdianth

Irdianth

    Potato Sprout

  • Members
  • 1 posts

Err, a few cents from a ghost...

 

Is it possible to gain from the crawlers?

I know you don't have a lot of ads here, but if you can just put enough ads so that those crawlers will pay for themselves by crawling your site won't the problem be solved?

As a user, I don't really mind ads as long as they're not distracting and obtrusive.



#193
aviar

aviar

    Fingerling Potato

  • Members
  • 64 posts

First time poster, essentially made an account for the Follow tool (a life saver). First, the site's great, and the staff are internet heroes for providing such a convenient resource. Second, two ideas to toss out there.

 

On the matter of crawling, it sounds like the problem is trawling for new manga and updates with respect to bots. If you can't beat them, it might alleviate some of the stress on the servers to implement an API that facilitates retrieving such material (eg: avoid constant manual crawls over the complete dataset). No real numbers behind that or anything, just an idea.

 

If an API is out of the question (technically unfeasible, non-optimum solution, heretical, etc.) you might look at network traffic classification as a possible technical solution. I don't have experience with this either, but glancing over a couple of papers, it seems like a promising solution as long as actually classifying all incoming traffic doesn't cost more than the odd bot.


Edited by aviar, 21 October 2015 - 12:03 PM.

I have come to warn you of the things beyond the wall and the men behind the machines.


#194
Aereus

Aereus

    Fingerling Potato

  • Contributor
  • 92 posts

Or not, why bother? They will just go to any of the dozen established sites that will keep crawling batoto and read their chapters there. At some point, they might prefer dealing with ads to dealing with registering.

 

You see, registering in a site means that you have to keep another user and pass for another site. Another site to remember or the password that you just used for 10 sites now got 11 (and pray that the site doesn't get hacked).

 

And then, you just encouraged other sites to develop bots to crawl batoto, as their ad revenue will be higher than before.

 

You guys are making this out to be more than it is. Literally all it means is you need to log in to see the content. It's not "private" in that you need an invite, or there's only so many that can get in, etc. There are tons of forums out there that require registering before seeing some of the forums. This is no different. And for a site like this that you would use quite often, it's not that much to ask. You people are seriously amazing complaining about this. This isn't some random forum somewhere you needed to register to ask a question once, never to go there again.

 

And you apparently miss the point entirely. It's not that the others have ads. It's that they're completely unscrupulous and without morals and couldn't give a fuck what happens to any of the series. They just want money. You're basically saying I'm apathetic, I'm going to support assholes bc meh.     Why?

 

What you're arguing for is for him to do nothing and shut the site down instead. What's happening currently is imagine you run a restaurant with seating for 50 people. It's lunch-time and you have that 1-hour period to make most of your sales for the day. 45/50 are people that come in, sit down, order a water then drag out their time and order nothing then leave at 1pm. And they come in and do this every day. You're losing out on a huge chunk of your sales bc these people are clogging up the restaurant doing you no good. Even worse, they come tromping in with muddy boots and get it everywhere causing lots of cleanup. Should the restaurant kick them out? Or let them stay and go out of business?



#195
Jnight

Jnight

    Potato Sprout

  • Members
  • 2 posts
  • LocationNewfoundland, Canada

So I don't really post here often, but I made an account to save my personal settings, so this change is unlikely to affect my PC Bato.to browsing. However, I use the Manga Rock app on Android and use Bato.to as its only source. I am assuming this change will make that no longer possible, meaning I would have to use some random aggregate site for my app, which I would rather not do.

 

Has the possibility of Bato.to having a mobile app ever been talked about? It could be another source of revenue, as I would gladly buy a Bato.to app if it functioned similarly to the Manga Rock app.


TFW waiting for the next chapter of your favorite manga!

 



#196
Aereus

Aereus

    Fingerling Potato

  • Contributor
  • 92 posts

As I suggested, what about simply providing the aggregators with an honest list of recent updates to public-accessible pages, and then just caching those?

 

Isn't that pretty much what the new releases portion on the front page is for? I can't believe the bots crawl the entire database rather than just linking off the new releases? Isn't that way more efficient?



#197
TheGrimReaper

TheGrimReaper

    Potato Spud

  • Members
  • 20 posts
  • LocationMalaysia

I couldn't care less either way. I would support this. But the big reason I like reading from this site is cos it's neat, clean and organised and, most of all, there are no shitty ads slapped all over the site and none of those damn app ads that would infect shit when I read on my tablet.

And why do people think batoto would lose new people? If you google a manga, it will most likely come up on the top page as a batoto link. It's just a choice: either sign up, which takes 2 steps like moving your mouse from left to right, or just close your browser. Difficult, yeah?

Oh, and one more thing about member-only privatizing: it would be moooooooore great if you were allowed to read some of those 18+ mangas that were blocked numerous times before.

 

 

what, u want this manga site to be like manga/doujintoshokan? have the "secret" sister site?



#198
maffa

maffa

    Russet Potato

  • Members
  • 209 posts
  • Locationitaly

registration required, plus a random email once in a while to ask for confirmation to avoid fake mails



#199
Grumpy

Grumpy

    RawR

  • Administrators
  • 4,078 posts
  • LocationHere of course!

Just now I was able to open 10 tabs as a guest with the "one page viewer" extension, almost instantly loading 100~ pages. 

To cover the entire catalog with the "3 chapters and 10 comic pages per hour" limitation, crawlers will need 1500~ IPs, around 56k IPs if they wish to load 170k English chapters. That's quite expensive... You might as well just provide them with an API and charge money.

I don't really want to spill all the beans on what's already being done, but it allows some initial bursts right now. Precisely for people who use extensions like that.

I don't think anyone's trying to download half the site in a single day. I think the current biggest hitters are comic page refreshes for new chapters.
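Grumpy doesn't say how the burst allowance works. One common way to permit a short burst while capping the sustained rate is a token bucket, sketched below purely as an assumption; the class name and parameters are made up for illustration.

```python
import time


class TokenBucket:
    """Allow short bursts up to `capacity`, then throttle to the refill rate."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)  # start full so a fresh visitor can burst
        self.updated = clock()

    def allow(self, cost=1):
        """Spend `cost` tokens if available; otherwise reject the request."""
        now = self.clock()
        elapsed = now - self.updated
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A reader who opens a whole chapter at once with an extension spends the initial tokens and is fine, while a client that keeps refreshing comic pages runs dry and is held to the refill rate.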

 

Isn't that pretty much what the new releases portion on the front page is for? I can't believe the bots crawl the entire database rather than just linking off the new releases? Isn't that way more efficient?

The front page doesn't actually contain all the chapters. If a series has like 20 new chapters at once, the mods will hide like 17 of them to prevent a front page/RSS feed flood. There is also the uploader's ability to "archive" them.

 

Err, a few cents from a ghost...

 

Is it possible to gain from the crawlers?

I know you don't have a lot of ads here, but if you can just put enough ads so that those crawlers will pay for themselves by crawling your site won't the problem be solved?

As a user, I don't really mind ads as long as they're not distracting and obtrusive.

Crawlers don't load ads. Even if they did, they wouldn't click them or buy from them. Even if all that was possible, that'd be breaking the terms with the ad company, since they're fake views.

 

If an API is out of the question (technically unfeasible, non-optimum solution, heretical, etc.) you might look at network traffic classification as a possible technical solution. I don't have experience with this either, but glancing over a couple of papers, it seems like a promising solution as long as actually classifying all incoming traffic doesn't cost more than the odd bot.

I honestly don't want to go the API route. Among 10 pages of people commenting here, it was mentioned a few times now. But here's why I don't want it:

  • We'll basically be acknowledging that we are a re-distribution source. That's not who we want to be.
  • This may end up spawning even more aggregates out there, as it makes it easier to be an aggregate than ever before. Then we'd be back here with the same problem. APIs in general are often paid, or there is a separate benefit for providing them that increases the provider's business. This would do neither for us. If we charge, they'll probably just go the free and current route.
  • There's also the issue of trust: that they'd trust our API and, conversely, that they would actually use the API instead of just crawling. There's no motive for them to change to the API at all if what they have is already working.
  • APIs aren't magically super efficient. It'd still have to do an extensive search of our database for obscure titles. If I make a per-comic API, it wouldn't really be significantly more efficient than it is now.


#200
Speedit

Speedit

    Potato Spud

  • Members
  • 25 posts
  • LocationUnited Kingdom

Which means I won't be able to download chapters from Batoto :(

Yeah, I know that bypassing the ads by directly downloading the images is greedy and cruel on your servers, but think about the people with low internet speeds. It takes some time to load a single image (it's like 1-2 minutes for One Piece and JoJo colored comics, which is really long), but downloading chapters using a download manager or FMD gives us the privilege of reading without waiting between two pages. It would be great if something could be done about it.

 

There is an existing solution for this. Please browse through Batoto Plus - a free extension on Chrome that accelerates image loading speed and preloads images - and change the skin to a more lightweight one.

 

Oh, and pop a link to Batoto Plus in your post as well. We want to educate as many users as possible about this amazing extension =D