      Training AI with 139,000 Scripts: The Massive Data Set Angering Hollywood Writers

      Written by

      Kelvene Requiroso
      Published December 16, 2024

        More than 139,000 TV and film scripts were converted into datasets and used to train AI models by Apple, Anthropic, Meta, and Nvidia without the knowledge of their authors, raising fears that their creative work is being used to train machines that could potentially replace them.

        The dataset spans more than 53,000 movies and 83,000 TV episodes, including a vast array of Best Picture nominees and episodes of The Simpsons, Seinfeld, Twin Peaks, The Wire, The Sopranos, and Breaking Bad.

        The dataset “even includes prewritten ‘live’ dialogue from Golden Globes and Academy Awards broadcasts,” said The Atlantic’s Alex Reisner, who broke the story.

        Dialogues as Datasets

        The datasets used to train the AI models did not consist of the original scripts but of subtitles that were extracted, compiled, and uploaded to OpenSubtitles.org. Some critics find the use of subtitles more concerning than the use of the more technical scripts, because subtitles capture the natural flow of conversational language.

        Generative AI models trained on well-written dialogue could not only mimic films but generate new ones entirely, which means AI could conceivably compete with the human writers on whose works it trained without their permission. This lack of transparency by AI companies has prompted artists, authors, and publishers to file lawsuits to defend the intellectual property rights of their creative outputs.

        “For as long as generative-AI chatbots have been on the internet, Hollywood writers have wondered if their work has been used to train them,” Reisner wrote. “The chatbots are remarkably fluent with movie references, and companies seem to be training them on all available sources.” He created a search tool for the Hollywood AI database to help writers determine whether their work was used.

        Response from Scriptwriters

        Hollywood writers responded angrily to the alleged theft of their work. The WGA and SAG-AFTRA unions had already contested the use of AI during their recent strikes.

        “I’m livid,” said David Slack, who wrote the TV show Teen Titans. “I’m completely outraged. It’s disgusting.” Slack discovered 42 scripts credited to him in the AI database. “It’s a huge amount of my work . . . These are things that I poured my heart and soul into.” Other popular writers whose work was used to train AI included Grey’s Anatomy creator Shonda Rhimes, who had 508 episodes in the dataset; American Horror Story creator Ryan Murphy, who had 346; and Matt Groening, creator of The Simpsons and Futurama, who had 742 episodes.

        AI’s lack of intentionality makes it unable to produce creative works solely on its own. Instead, it relies on the work of human authors in a way that many consider plagiarism. The issue is more complex still, because in many cases the studios, not the writers, own the copyrights to the scripts, which leaves the writers with even less legal recourse or claim to compensation.

        Kelvene Requiroso
        Having a wide range of interests, Kelvene has taught both on campus and online, immersed himself in advocacy and development work, and published numerous reviews and analyses of the latest technologies. Kelvene co-founded ISMS Robotics.