Stolen Content: Is It Scraped, Plagiarism, or Copyright Infringement?

Anytime I write a blog post that gets a trackback from another post, I look into it. I like to see how other blogs use my post as a resource.

So when a blog post I wrote (6 Subheadings Strategies You Need to Know) received a trackback from another post (Get Easy Scannability with Subheadings), I went to check it out. But I was disappointed to find that my article was not referenced as a resource. Instead it was only mentioned at the end of the article under an “Acknowledgement.”

Acknowledgement Link

But even more troubling, I found that my article wasn’t a resource for the blog post — it was the blog post.

The introductions were slightly different. A few sentences and the conclusion were removed, but the body of the content — the meat — was all mine. Even the image, selected and used by the publisher Blog World, was the same.

Original Post From Blog World

original

Post from Social Media Company

copy

A Copyscape report showed that 366 words matched, which is 54% of the original text. The pink text in the report marks everything that was copied from the original.

Copyscape Plagiarism Checker
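If you want a rough do-it-yourself version of that kind of match percentage, the sketch below may help. It is not how Copyscape works (its method isn't described here); it simply assumes that a word counts as "copied" when it sits inside a run of three consecutive words that also appears in the other text. The function names and the three-word window are my own choices for illustration.

```python
import re


def words(text):
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def shingles(tokens, n=3):
    """Return the set of consecutive n-word sequences in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def copied_share(original, suspect, n=3):
    """Estimate the fraction of the original's words that reappear in the
    suspect text as part of a shared run of n consecutive words."""
    orig_tokens = words(original)
    if len(orig_tokens) < n:
        return 0.0
    suspect_shingles = shingles(words(suspect), n)
    matched = [False] * len(orig_tokens)
    for i in range(len(orig_tokens) - n + 1):
        if tuple(orig_tokens[i:i + n]) in suspect_shingles:
            # Mark every word in this shared run as copied.
            for j in range(i, i + n):
                matched[j] = True
    return sum(matched) / len(orig_tokens)


if __name__ == "__main__":
    original = "Subheadings make long posts easy to scan for busy readers"
    suspect = "Our tip: subheadings make long posts easy to scan, every time."
    print(f"{copied_share(original, suspect):.0%} of the original was matched")
```

A longer window (say, five words) flags fewer coincidental matches, while a shorter one catches more lightly edited copies. Either way, treat the number as a prompt to read both posts side by side, not as proof.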

Is This Plagiarism?

In my mind this seemed like an obvious case of plagiarism. I wanted to know how sites got away with this, so I reached out to Richard Stim (an attorney and author who specializes in copyright, patents, and trademark issues) and explained the situation to him.

But his explanation didn’t really back up my plagiarism accusation.

“What you describe is not plagiarism but it may be copyright infringement.

Plagiarism is not a legal standard – it’s when someone else poses as the originator of your words or ideas.

Copyright infringement is a legal standard – it’s when someone makes unauthorized use of your copyrighted material.”

To explain the difference between plagiarism and copyright infringement, Stim showed me an example of plagiarism that he encountered at his company Nolo, an online library of legal information.

What Is Plagiarism?

Stim’s team wrote a post for the Nolo blog (Ten Tips for Songwriters: Credits, Copyrights, and Coauthors).

According to Stim, their article was then plagiarized and posted on the Find Law blog (Songwriter Tips for Copyright, Credit, and Royalties).

In this case, the content was not copied word for word. Instead the second article is an extremely close representation of the original author’s words and ideas — which fits within the definition of plagiarism.

Original Post from Nolo

Number 8

Post from Find Law

Duplicate One

Original Post from Nolo

number 10

Post from Find Law

Second Paragraph

As you can see, the strings of words in these two articles don’t match exactly. But each section conveys almost exactly the same message as its counterpart. Stim says this is an example of plagiarism.

Copyright infringement, on the other hand, is when someone makes unauthorized use of your original work — reproducing your content as is. Copyright infringement is the more accurate way to describe what happened to me (and Blog World, I might add).

Is It Copyright Infringement if There Is Attribution?

In my situation, the publisher for Social Media Company attributed the post through an “acknowledgement” and a link. I asked Stim whether that attribution is enough. Does acknowledgement give the right to reproduce content? Does it negate the copyright infringement?

“Attribution has zero effect on determining whether copyright infringement has occurred.

It’s kind of like if someone stole your car but was nice enough to paint your name on the side; you’re still the victim of car theft.

It’s possible that attribution may be considered as a mitigating factor when determining damages for copyright infringement (which is more likely if the borrower sought to locate the copyright owner but failed) or a judge may consider it an aggravating factor (proof that the borrower knew you were the author but went ahead and ripped you off anyway).”

So it seems that attribution doesn’t right the wrong — especially in my case, where the acknowledgment came right before the “author” bio (which seems like an oxymoron to me).

Acknowledgement

Is Scraping Content Stealing?

After looking at a few other articles on Social Media Company’s blog, it became pretty evident that the site is scraping content.

“Scraping content” is an automated process in which software crawls a site, copies its content, and republishes it on another site.

Social Media Company isn’t the only company scraping content. All content that can be viewed on a webpage can be scraped, so many sites take this as an opportunity to find content relevant to their audience and swipe it. It’s also not likely to stop anytime soon.

Google doesn’t like it either and is doing its best to protect writers and publishers. This excerpt from its Webmaster Tools breaks down its views on scraped content.

Google scraped content explanation

But the fact remains that a lot of websites are scraping and stealing content.

What Should I Do?

I asked Stim what he recommends authors and/or publishers do if this happens to them.

“That’s a tough call. You can write to the borrower and ask them to remove it, citing your copyright ownership. You can exert rights under the DMCA and have the borrower’s ISP take down the entry. But filing a lawsuit would probably not be worth your efforts unless you could demonstrate a financial injury that would justify the attorney fees.”

So scraped and stolen, what’s the next step?

For me… nothing.

I’m not going after Social Media Company, and I doubt that Blog World will either. When it comes to scrapers that steal content, the best thing to do is protect your content from being scraped in the first place.

While it isn’t really worth it to take legal action, you could contact the site and ask them to remove it. Or you can report the site to Google if you want to take it one step further.

But if it’s more than a scraper and someone is out there stealing your work (copyright infringement) or your ideas (plagiarism), at least now you know the difference and can be armed to fight the fight… when it’s worth it.

Richard Stim is an author and attorney specializing in small business, copyright, patents, and trademark issues. He writes about these topics on his blog Dear Rich Blog and for the blog on Nolo.com, one of the largest online resources for legal information. Stay tuned for more posts featuring answers from our interview with him.

This article does not intend to provide any legal advice whatsoever.

Comments

  1. says

    So, the fine line between plagiarism and copyright infringement is that plagiarism is somewhat paraphrased. Seems the legal loopholes are what keeps scrapers in business.

    • Neil says

      Plagiarism is irrelevant in terms of the law, that’s all. It is just intellectual dishonesty.

      Plagiarised content is copyright infringement anyway, so what Mr Stim was effectively saying was forget plagiarism, look at it as a copyright infringement issue. No loopholes.

    • Neil says

      The more I think about it, the more I think publishing a licence fee is the way to go (see my post below). Blatant copyright infringement is actually quite easy to prove and I think you would probably win a court case. But what damages is the judge going to award? Probably none, although by winning the case the defendant would have to pay the £100 or so it costs for you to file a claim and bring it to court.

  2. Neil says

    I would suggest posting a copyright notice on your page. Although an author will automatically get copyright, the copyright symbol and a severely worded warning could act as a deterrent and would support you in any court case.

    Damages is an interesting one. Photography agencies (notoriously Getty Images) publish a license fee for use of their images and claim damages in terms of the revenue they ‘lost’ through a license fee not being paid. In my opinion, these companies are going to the other extreme by aggressively targeting vulnerable individuals and small businesses with extortionate demands for payment – but I digress.

    Perhaps blog writers could start doing something similar by publishing a licence rate for use of their blog post.

    Personally, I would imagine the potential publicity from somebody visiting your site (especially if there is a live link) would outweigh the fact of copyright infringement.

    • says

      Great points, Neil. I’m really fascinated with the current and future state of copyright infringement on the Internet. The copyright notice that you mention definitely sounds like the next step in my research. And I think there is some validity to your comment about the licence rate for blog posts. I like that idea.

  3. Cara says

    I hate sites that scrape content! It happens to me all the time. I usually send an email demanding that my content be removed, but this has only been effective a handful of times. One time I wrote an article for a company as part of a job application and they published my article without hiring or contacting me… They never responded to any of my emails.

    Copyright infringement is one of the most annoying things to me as a writer. However, I try not to waste too much of my time fighting it because nothing seems to work. I’d like to see a blog post about our options as writers. Like, should I try to report the site to the BBB? Is there a website where I can flag websites that scrape content? Are there phrases I can use in emails that are the most effective in getting my content removed or myself compensated in some way?

    • says

      That’s terrible that a site took your post and didn’t even hire you! Those are great questions too. I’m going to have to reach out to Rich and see if he will answer some more of my questions. Seems like a lot of people are interested in learning more about this.

  4. says

    Raubi, saw this post from the email. It’s very discouraging. For SEO to be effective you need to create fantastic content that people want to share and link to – and not being able to go after scrapers and ‘regurgitators’ who use your content without proper linking and attribution makes it less cost effective.

    At least Google author attribution helps and it seems the message is to push on with creating lots of good stuff and hope some rules are applied by the online cop of the wild west – the panda, penguin etc maker. ;-)

  5. Rooken says

    By definition then aren’t Facebook users who repost videos, articles and images guilty of copyright infringement and “thin affiliates”…

    • says

      I’m not really sure how it works with social sites. I know Pinterest has had the biggest problems with infringement in the past.

      But the big difference I see is that most social sites don’t claim attribution on the content they repost. That’s the part I personally have an issue with. Reusing or repurposing content is one thing, but reposting and then taking credit is where I have an issue.

  6. Aaron Johnson says

    I don’t scrape content. I write my own blog posts and articles. I license my work under a Creative Commons Attribution license. I write unique content that I create by default.
