Reproducers & Scrapers – Legal or Illegal?

Over the past few years there has been a growing trend of ‘reproducers’ and ‘scrapers’ on the Internet.

So firstly, so that we’re all singing from the same hymn sheet (so to speak), let’s clarify what they are.  ‘Reproducers’ and ‘scrapers’ basically hunt around the Internet collecting forum posts, blog posts, articles and so on from every website they can find, then display them on their own sites.  Some of the ‘better’ ones attach some kind of purpose to this, for example, encouraging critical discussion from a wider audience that you may not otherwise reach.  Some, although not many, analyse what they have collected, perform statistical analysis and then post this alongside the content with a link back.  Others just reproduce your content verbatim and that is it.
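To make the mechanics concrete, here’s a rough sketch of what a verbatim scraper actually does.  This is purely illustrative: the URL and page selectors are made up, and it assumes the third-party Python requests and BeautifulSoup libraries.

```python
# A minimal sketch of a verbatim scraper: fetch a page, pull out the
# article text, and keep it for republishing unchanged. The selectors
# are hypothetical; real scrapers typically crawl feeds or sitemaps.
import requests
from bs4 import BeautifulSoup

def scrape_article(url):
    # Download the raw HTML of the target page.
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    # Grab the title and body text (selectors vary from site to site).
    title = soup.find("h1")
    body = soup.find("article") or soup.find("div", class_="post")

    return {
        "title": title.get_text(strip=True) if title else "",
        "text": body.get_text("\n", strip=True) if body else "",
        "source": url,  # the 'better' scrapers at least link back
    }

# Hypothetical usage:
# post = scrape_article("https://example.com/blog/some-article")
# print(post["title"])
```

A scraper then simply stores that result and republishes it on its own site, which is exactly the verbatim copying discussed in this article.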

But, with ever-growing numbers of reproducers and scrapers on the Internet, the question of their legality is being raised by the online community, and if it hadn’t been for an untimely global financial crisis, the courts might well have been interested.

Well, the courts were, in a roundabout way, interested in the broader issue that this covers.  There was a big scandal regarding SOPA and PIPA (both proposed US legislation) and ACTA (an international agreement that the EU was considering).  The online uprising against these pieces of legislation was most definitely justified, as they contained completely ridiculous clauses and regulations and, most importantly, were never going to achieve the aims that politicians thought they would (we won’t get started on how politicians and computing issues don’t mix).  The news that the European Parliament had overwhelmingly rejected ACTA was therefore very welcome (and was covered by the new British tech blog, inbritech.com), but the question remains of how computing issues like this should be dealt with.

Now, a reproducer or a scraper is basically an automated copyright thief.  I draw this conclusion from the fact that they copy content from other websites verbatim.  I have a little more sympathy for those sites that use reproducers and scrapers to promote discussion of the content (with proper referencing, of course), or to perform statistical analysis, data mining and so on.  And that is only a little, and I mean a very little, more than the zero sympathy that I have for the purely verbatim reproducers and scrapers mentioned first.

Now, in the UK, all original content is automatically given copyright protection upon creation, with the rights assigned to the author pursuant to the Copyright, Designs and Patents Act 1988.  Surely these reproducers and scrapers (and therefore their creators and owners) are breaching the owner’s copyright granted by that legislation?  My expertise lies in computing, not law, so I can’t give a cast-iron legal opinion on it.  But the one thing that I can say is that if by some miracle it isn’t illegal, it is most definitely immoral.

So, what are the ways of tackling this?  Well, we could all just stop producing content.  But where would the fun be in that?  And why should we?  We’ve done nothing wrong!  New legislation?  Well, I’m for new legislation when it’s required, but what is required in this situation is the enforcement of current legislation.

Maybe one of the reasons that current copyright legislation can’t be enforced in this circumstance is that it is a non-living machine that actually commits the copyright theft?  I would concede, however, that this isn’t a likely explanation: computer viruses are also non-living programs, yet there have been famous cases in the past of their creators being (rightly) held responsible for their actions in creating the virus.

Now don’t get me wrong, we shouldn’t start a witch hunt here.  The 13-year-old boy who has plagiarised one of your articles shouldn’t be sent to prison for a century or anything drastic like that.  It’s not about targeting individuals, it’s about targeting the websites that copy content verbatim, collate it and publish it on their own pages.

It also raises the question: what is the point of reproducers and scrapers that just duplicate previously published content?  It can be good for an author to share their content and have multiple websites publish it – that way it gets a wider audience, greater debate and so on – and we do this with our DPS Computing guest articles occasionally.  However, the key difference there is that permission is obtained and the copyright holder actually wants the information reproduced.

We already know by now that major search engines, such as Google, like leaders and not sheep.  They prefer the people (and therefore the websites) that create new content over those that merely reproduce it at a later date.

Overall, I think we can safely conclude that reproducers and scrapers that copy content verbatim from across the Internet are illegal under the copyright law of most countries.  And that is before we even get on to the point that most serve absolutely no purpose other than to clog up the search results pages.

It’d be interesting to hear from any of our visitors/members in the legal profession as well, to clarify the situation regarding the legality of reproducers and scrapers.

PS.  In a few hours this article will, no doubt, have been reproduced by hundreds of reproducers and scrapers.  How ironic! ;).

DPS David

Comments (4)

  • While I'm not a lawyer (nor is this guaranteed to be correct at all), I would assume that you would be legally responsible for creating an application that facilitates copyright infringement? What does The Pirate Bay do? Don't they simply facilitate the ability to infringe the intellectual property rights of others under copyright law? No copyrighted material is stored on servers owned by The Pirate Bay; all that is stored on their servers is essentially the data needed to download the complete content of a torrent from multiple users. Hence, my conclusion is that The Pirate Bay is doing nothing more than facilitating copyright infringement, and I would assume it is in violation of British law to knowingly facilitate copyright infringement.

  • Yes, I would tend to agree. For example, consider the following situation: a terrorist makes a bomb and plants it in a shopping centre. They escape and the bomb explodes, killing many people. Now, it isn't an adequate defence for that terrorist to say 'well, it wasn't me that killed everyone, it was the bomb'. While in language terms that is a correct statement, in legal terms, had it not been for that terrorist making and planting the bomb, those people wouldn't have died. That is the reason why I think the creators of reproducers & scrapers are held legally responsible. And if they're not, they should be. And it is most definitely immoral.

    Yes, The Pirate Bay is a very complex issue. I can see it from both sides. On the one side, they are facilitating copyright infringement. On the other side, I guess they would say, as you rightly pointed out Ben, that a) they don't host the copyrighted content. I also think they would say b) they don't even create the torrent files for the copyrighted content, these are user generated and uploaded, and c) if users stopped sharing copyrighted content that they didn't have a right to share, then there would be no issue.

    It's definitely a very complex issue and I'm sure there will be many more twists and turns in The Pirate Bay saga before it's over.

  • Hi David,

    But I think under the law The Pirate Bay (like anyone else) has a responsibility to remove material from their service if it facilitates the infringement of copyright and they have been notified about it. Well, definitely once a court declares the material in violation of copyright law ;).

  • Well yes, that is also an issue, and I'm most definitely not saying that they are in the right, just playing devil's advocate and seeing it from both sides.

    In the past, sites such as The Pirate Bay have said that it is impossible to manually filter user submissions due to the sheer number of them, and that it isn't technically possible to implement an automated filtering system that works (the sketch after these comments illustrates why naive automated filtering is so easily evaded).

    Evidently, though, sites such as The Pirate Bay are unlikely to have dedicated much time to these issues, as solving them would be counterproductive to their popularity.
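On that filtering point, here is a rough sketch of why a naive automated filter is so easy to evade. This is purely illustrative: the blocklist and the upload titles are invented, and both real filtering and real evasion are far more involved.

```python
# A naive automated filter of the kind discussed above: block any upload
# whose title mentions a known copyrighted work. The blocklist and the
# example titles are invented purely for illustration.
BLOCKLIST = {"some blockbuster film", "popular album"}

def is_blocked(title: str) -> bool:
    # Case-insensitive substring match against the blocklist.
    lowered = title.lower()
    return any(work in lowered for work in BLOCKLIST)

uploads = [
    "Some Blockbuster Film (2012) DVDRip",  # caught by the filter
    "S0me Bl0ckbuster F1lm [HQ]",           # trivially evaded with digit swaps
    "My holiday videos",                    # legitimate upload, correctly allowed
]

for title in uploads:
    verdict = "blocked" if is_blocked(title) else "allowed"
    print(f"{title!r}: {verdict}")
```

Scaling something like this up to millions of user submissions, in every language, without blocking legitimate content is the hard part that such sites point to when they claim automated filtering isn't feasible.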