
Machine Learning

China’s DeepSeek Unveils AI Model That Halves Costs – The ‘Sparse Attention’ Revolution

DeepSeek’s New V3.2-Exp Model. Hangzhou-based DeepSeek burst onto the AI scene earlier in 2025 with its R1 model, a heavily RL-trained chatbot (techcrunch.com). This time, DeepSeek’s announcement focuses on efficiency. On Sept. 29 the company published a post on Hugging Face unveiling DeepSeek-V3.2-Exp, an experimental large language model built on its V3 series (techcrunch.com). According to DeepSeek, V3.2-Exp maintains reasoning performance similar to V3.1 but uses far less compute for long inputs. The key innovation is a “DeepSeek Sparse Attention” (DSA) mechanism: rather than comparing every token to every other in a long document (the dense attention used by standard Transformers)…
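The intuition behind sparse attention can be sketched in a few lines. This is an illustrative top-k scheme in plain NumPy, not DeepSeek's actual DSA implementation; the function name and parameters are hypothetical:

```python
import numpy as np

def topk_sparse_attention(q, K, V, k=8):
    """Attend only to the k most relevant keys for one query vector.

    Dense attention scores every key against the query; a sparse scheme
    like this keeps just the top-k scores, cutting compute and memory
    for long sequences.
    """
    scores = K @ q / np.sqrt(q.shape[0])    # relevance of each key, (seq_len,)
    top = np.argpartition(scores, -k)[-k:]  # indices of the k best-scoring keys
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()                # softmax over only the k survivors
    return weights @ V[top]                 # weighted sum of the k value vectors

rng = np.random.default_rng(0)
q = rng.normal(size=64)             # one query vector
K = rng.normal(size=(1024, 64))     # 1024 keys in a long context
V = rng.normal(size=(1024, 64))     # 1024 values
out = topk_sparse_attention(q, K, V, k=8)
print(out.shape)                    # same shape as one value vector
```

Here only 8 of the 1024 keys contribute, which is where the compute savings on long inputs come from; real systems like DSA select tokens with a learned mechanism rather than raw score ranking.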
AI Frenzy: Billion-Dollar Bets, Self-Learning Breakthroughs & Legal Showdowns – Global AI Roundup (Aug 13–14, 2025)

Meta says its latest AI systems are beginning to improve themselves without human intervention, a step toward what Mark Zuckerberg called artificial superintelligence, with a policy memo noting “glimpses” of self-improvement. Zhipu (Z AI) open-sourced GLM-4.5, a 355-billion-parameter mixture-of-experts model that ranks third globally on reasoning and coding benchmarks, as China reports more than 1,500 domestically developed large AI models. Google pledged $9 billion to expand its U.S. AI infrastructure, including a new data center campus in Oklahoma, raising annual capex to $85 billion and committing $1 billion for AI training at more than 100 U.S. colleges. Cisco reported $2 billion…
14 August 2025
How Machine Learning Works and Why It’s Changing Everything in 2025

Supervised learning trains on labeled data to predict outputs, enabling tasks like spam detection, house-price prediction, and recognizing cats in labeled photos. Unsupervised learning finds structure without labels, enabling clustering of news articles by topic and segmentation of customers into groups with similar behavior. Reinforcement learning trains an agent by interacting with an environment to maximize rewards, powering autonomous decisions in self-driving cars and game AI. Neural networks, especially deep neural networks, consist of multiple layers of interconnected nodes that progressively extract higher-level features for image, speech, and language tasks. Overfitting occurs when a model memorizes the training data and fails to generalize to new data…
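The supervised-learning loop described above (labeled examples in, a predictor out) fits in a few lines. This is a toy least-squares sketch with made-up house-price figures, not any particular production pipeline:

```python
import numpy as np

# Supervised learning in miniature: learn price = w * size + b from
# labeled examples. Sizes are in m^2, prices in $1000s; the numbers
# are invented for illustration (here price is exactly 3 * size).
sizes  = np.array([50.0, 80.0, 120.0, 200.0])
prices = np.array([150.0, 240.0, 360.0, 600.0])

X = np.column_stack([sizes, np.ones_like(sizes)])  # feature column + bias column
w, b = np.linalg.lstsq(X, prices, rcond=None)[0]   # fit by least squares

prediction = w * 100 + b   # predict the price of an unseen 100 m^2 house
print(w, b, prediction)
```

The same in/out pattern scales up: spam filters and image classifiers just use richer features and more flexible models than a straight line.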
The Surprising Science Behind Neural Networks: How They Work and How to Build Your Own (Beginner’s Guide)

Neural networks are brain-inspired models that learn from data by adjusting weights and biases, not by hand-coding every rule. They consist of layers—input, one or more hidden layers, and an output layer—where each neuron connects to the next layer. Each connection has a weight and each neuron includes a bias, and these are the learnable parameters adjusted during training. Activation functions such as sigmoid, ReLU, and tanh introduce non-linearity, enabling the network to model complex patterns. A forward pass moves data from the input layer through hidden layers to the output, computing weighted sums, adding biases, and applying activations. Training…
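The forward pass described above—weighted sums, biases, activations, layer by layer—can be sketched in NumPy; the layer sizes and random weights below are arbitrary illustrations:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)          # non-linearity for the hidden layer

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))    # squashes the output into (0, 1)

def forward(x, params):
    """One forward pass: weighted sum + bias, then activation, per layer."""
    W1, b1, W2, b2 = params
    h = relu(W1 @ x + b1)      # hidden layer: 3 inputs -> 4 hidden units
    y = sigmoid(W2 @ h + b2)   # output layer: 4 hidden units -> 1 output
    return y

rng = np.random.default_rng(42)
params = (rng.normal(size=(4, 3)), np.zeros(4),   # W1, b1 (learnable)
          rng.normal(size=(1, 4)), np.zeros(1))   # W2, b2 (learnable)

x = np.array([0.5, -1.0, 2.0])   # one input example
y = forward(x, params)
print(y)                         # a single value between 0 and 1
```

Training (not shown) would repeatedly compare `y` against a label and nudge the weights and biases to shrink the error.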
32B AI Model Trained by a Swarm of Volunteer GPUs – Inside INTELLECT-2’s Decentralized Revolution

In May 2025, Prime Intellect unveiled INTELLECT-2, a 32B-parameter LLM trained on a globally distributed swarm of volunteer GPUs. INTELLECT-2 is the first model of its scale trained via fully asynchronous reinforcement learning across hundreds of heterogeneous, permissionless machines (chakra.dev). PRIME-RL coordinates the training loop by separating experience generation on inference workers from policy updates on trainer nodes, eliminating traditional synchronization bottlenecks. SHARDCAST distributes the 32B model weights via a tree-based, pipelined transfer to hundreds of volunteer GPUs, accelerating weight propagation. TOPLOC provides a probabilistic verification fingerprint using locality-sensitive hashing to detect tampering or incorrect results from untrusted nodes.
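TOPLOC's exact scheme isn't detailed here, but the general idea of a locality-sensitive-hashing fingerprint can be sketched with random-hyperplane LSH (an illustrative stand-in, not Prime Intellect's implementation):

```python
import numpy as np

def lsh_fingerprint(activations, planes):
    """Sign-of-projection LSH: nearby vectors get mostly-matching bits."""
    return (activations @ planes.T > 0).astype(np.uint8)

rng = np.random.default_rng(7)
planes = rng.normal(size=(32, 128))   # 32 random hyperplanes -> 32-bit fingerprint

honest   = rng.normal(size=128)                          # trusted reference run
reported = honest + rng.normal(scale=0.01, size=128)     # honest run, tiny numeric noise
tampered = rng.normal(size=128)                          # unrelated (faked) result

fp = lsh_fingerprint(honest, planes)
honest_match   = (fp == lsh_fingerprint(reported, planes)).mean()
tampered_match = (fp == lsh_fingerprint(tampered, planes)).mean()
print(honest_match)    # close to 1.0: small perturbations rarely flip bits
print(tampered_match)  # near 0.5: unrelated vectors agree only by chance
```

The verifier only needs the short fingerprint, not the full activations, which is what makes probabilistic checking of untrusted volunteer nodes cheap.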
Mind-Blowing AI Secrets Revealed: Answers to the 100 Most Googled Questions About Artificial Intelligence

ChatGPT is a form of AI, a large language model developed by OpenAI that was publicly released in November 2022. GPT-4, released in 2023, scored in the top 10% on a simulated bar exam for lawyers. Siri and Alexa are AI-powered virtual assistants that rely on natural language processing and machine learning to understand voice commands. Google uses AI under the hood for search ranking and language understanding—for example, models like BERT to interpret queries—and has launched its own AI chatbot, Bard. AI can be categorized into four types: Reactive Machines (e.g., Deep Blue), Limited Memory AI, Theory of Mind AI, and Self-Aware AI…
Latest Developments in AI (June–July 2025)

In mid-June 2025, OpenAI CEO Sam Altman indicated that an anticipated open-source AI model would slip from June to later in the summer. On June 30, 2025, Mark Zuckerberg announced Meta Superintelligence Labs, naming Alexandr Wang as Chief AI Officer and Nat Friedman as a partner, with Meta hiring 11 engineers from Anthropic, Google DeepMind, and OpenAI that month. In June 2025, Google began integrating its Gemini AI into consumer apps with parental controls, while Microsoft expanded AI copilots across Windows and Office. In late June 2025, Air Canada refunded a customer after its AI chatbot provided incorrect information…
AI News Roundup – June 28, 2025

Meta hired Trapit Bansal, a key OpenAI researcher, and acquired a 49% stake in Scale AI valued at nearly $15 billion, while securing a 1.1 GW nuclear power supply for AI data centers starting in 2027. Amazon’s stock has nearly doubled in the past three years as AWS commands about 30% of the global cloud market, fueling AI‑driven growth across its businesses. Salesforce launched Agentforce 3, upgrading its AI‑driven support platform with a live‑monitoring command center and an Agent Exchange marketplace of over 100 pre‑built automations, and reported a 233% rise in AI adoption over six months. Perplexity Labs added features turning its AI…