Gemini 1.5 is Google’s next-gen AI model — and it’s already almost ready (6 minute read)

Google has launched Gemini 1.5, making it available to developers and enterprise users ahead of a full consumer rollout. The new model reportedly performs on par with the high-end Gemini Ultra that Google recently launched. It was built using the Mixture of Experts technique, making it both faster for users and more efficient for Google to run. Gemini 1.5 has a context window of 1 million tokens, equivalent to tens of thousands of lines of code. Google researchers are currently testing a 10 million token context window.
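
As a rough illustration of why Mixture of Experts keeps inference cheap (a toy Python sketch, not Google's implementation): a router picks a few "expert" sub-networks per token, so only a fraction of the model's total parameters run for any given input.

    # Toy sparse MoE layer: route each token to the top-k experts only.
    import numpy as np

    rng = np.random.default_rng(0)
    d_model, n_experts, top_k = 16, 8, 2

    experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
    router = rng.standard_normal((d_model, n_experts))

    def moe_layer(token):                      # token: vector of size d_model
        scores = token @ router                # one routing logit per expert
        chosen = np.argsort(scores)[-top_k:]   # keep only the top-k experts
        weights = np.exp(scores[chosen])
        weights /= weights.sum()               # softmax over the chosen experts
        # Only the chosen experts do any work, which is why inference is cheaper
        # than running a dense model with the same total parameter count.
        return sum(w * (token @ experts[i]) for w, i in zip(weights, chosen))

    print(moe_layer(rng.standard_normal(d_model)).shape)  # (16,)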

Bluesky and Mastodon users are having a fight that could shape the next generation of social media (7 minute read)

A developer recently released a bridge to connect Bluesky's AT Protocol with Mastodon's ActivityPub. The networks use two different protocols, meaning that their users can't natively interact. The release proved controversial, triggering an intense debate on the project's GitHub issue page. Bridging the protocols could mean that users' posts show up in places they didn't anticipate. The way these protocols interact with one another could set the stage for the next era of the internet.

OpenAI’s new Sora model can generate minute-long videos from text prompts (4 minute read)

Sora will be made available to a small group of academics and researchers who will assess its potential for harm and misuse.

Google goes “open AI” with Gemma, a free, open-weights chatbot family (2 minute read)

Gemma, Google's new family of AI language models, consists of free open-weights models built on technology similar to Gemini. Gemma models can run locally on a desktop or laptop computer. There are two models, one with 2 billion parameters and the other with 7 billion, each available in pre-trained and instruction-tuned variants. Google claims the 7 billion parameter model outperforms Meta's Llama 2 7B and 13B models on several benchmarks covering math, Python code generation, general knowledge, and commonsense reasoning. This is Google's first significant open large language model release since OpenAI's ChatGPT was revealed in late 2022.
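
Because the weights are open, the models can be pulled down and run locally with standard tooling. A minimal Python sketch, assuming the instruction-tuned 2B checkpoint is published on Hugging Face under an ID like google/gemma-2b-it (access may require accepting Google's license terms):

    # Load the 2B instruction-tuned Gemma locally via Hugging Face Transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "google/gemma-2b-it"   # assumed model ID
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Ask a short question and decode the generated continuation.
    inputs = tokenizer("Explain what an open-weights model is.", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))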

Reddit has a new AI training deal to sell user content (2 minute read)

Reddit has signed a deal with an unnamed large AI company for access to its user-generated content platform. The deal is worth about $60 million annually and is still subject to change as Reddit's plans to go public move forward. Companies like OpenAI and Apple are also seeking deals with other platforms to use their data to train AI models.

Raising children on the eve of AI (8 minute read)

The future is hard to imagine, but we shouldn't be afraid of it.

ChatGPT’s Growth Is Flatlining (4 minute read)

ChatGPT has seen declining web traffic in five of the past eight months and is currently down 11% from its May 2023 peak. Its mobile app has fewer total users than Snapchat added in the last quarter alone. Competition in the AI space means that OpenAI will have to keep producing hits to stay ahead.

Mind-reading devices are revealing the brain’s secrets (15 minute read)

The development of brain-computer interface (BCI) technology has allowed scientists to learn lessons about the brain that are overturning assumptions about brain anatomy. BCIs let scientists record single-neuron activity in many brain areas that were previously inaccessible, and they allow for longer recordings than classical tools. Advances in artificial intelligence, decoding tools, and hardware have propelled the field forward. This article discusses some of the findings that BCI technology has enabled.

ReadySet (GitHub Repo)

ReadySet is a transparent database cache for Postgres and MySQL. It provides the performance and scalability of an in-memory key-value store without requiring users to rewrite their apps or manually handle cache invalidation. ReadySet can turn even the most complex SQL reads into lightning-fast lookups. It keeps cached query results in sync with the upstream database automatically by listening to the database's replication stream. ReadySet can be used with an existing ORM or database client.
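
In practice, that means pointing an existing client at ReadySet instead of the database and telling it which reads to cache. A minimal Python sketch with psycopg2, where the host, credentials, table, and query are hypothetical and the CREATE CACHE statement follows ReadySet's SQL extension (exact syntax and placeholder style may vary by version):

    import psycopg2

    # Connect to ReadySet exactly as if it were Postgres; only host/port change.
    conn = psycopg2.connect(
        host="readyset.internal", port=5433,
        dbname="app", user="app", password="secret",
    )
    conn.autocommit = True

    with conn.cursor() as cur:
        # Ask ReadySet to cache this read; results stay fresh via the
        # database's replication stream, with no manual invalidation.
        cur.execute("CREATE CACHE FROM SELECT id, title FROM posts WHERE author_id = $1")
        # Subsequent matching reads are served from the in-memory cache.
        cur.execute("SELECT id, title FROM posts WHERE author_id = %s", (42,))
        print(cur.fetchall())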

The killer app of Gemini Pro 1.5 is video (13 minute read)

Gemini Pro 1.5 is an enormous upgrade for the Gemini series. The model has a 1,000,000 token context window, much larger than Claude 2.1's 200,000 tokens and gpt-4-turbo's 128,000. While the model can still miss things and hallucinate incorrect details, it is able to process short videos and extract text information from them. This article contains an example where Gemini Pro 1.5 is used to extract book names from a short video.
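
The article works through Google AI Studio's web UI; roughly the same workflow can be expressed against the Gemini API. A minimal Python sketch using the google-generativeai SDK, where the model name, file name, and prompt are assumptions and video input may depend on your API access:

    import time
    import google.generativeai as genai

    genai.configure(api_key="YOUR_API_KEY")

    # Upload the short video through the File API and wait for processing.
    video = genai.upload_file(path="bookshelf.mp4")   # hypothetical file name
    while video.state.name == "PROCESSING":
        time.sleep(5)
        video = genai.get_file(video.name)

    # Ask the model to read the book titles out of the footage.
    model = genai.GenerativeModel("gemini-1.5-pro-latest")   # assumed model name
    response = model.generate_content(
        [video, "List every book title visible in this video as a JSON array."]
    )
    print(response.text)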