OpenAI transcribed over a million hours of YouTube videos to train GPT-4

Photo illustration of the shape of a brain on a circuitboard.
Cath Virginia / The Verge | Photos from Getty Images

Earlier this week, The Wall Street Journal reported that AI companies were running into a wall when it comes to gathering high-quality training data. Today, The New York Times detailed some of the ways companies have dealt with this. Unsurprisingly, they involve doing things that fall into the hazy gray area of AI copyright law.

The story opens on OpenAI which, desperate for training data, reportedly developed its Whisper audio transcription model to get over the hump, transcribing over a million hours of YouTube videos to train GPT-4, its most advanced large language model. That’s according to The New York Times, which reports that the company knew this was legally questionable but believed it to be fair use. OpenAI president Greg…
