Neural Networks on s-ai-unix's Blog

Recent content in Neural Networks on s-ai-unix's Blog
https://s-ai-unix.github.io/tags/%E7%A5%9E%E7%BB%8F%E7%BD%91%E7%BB%9C/
Hugo -- 0.161.1, zh-cn, Mon, 16 Feb 2026 10:32:13 +0800

Andrej Karpathy's minGPT: Understanding How GPT Works in 300 Lines of Code
https://s-ai-unix.github.io/posts/2026-02-16-mingpt-300-lines-gpt/
Mon, 16 Feb 2026 10:32:13 +0800
An accessible walkthrough of GPT's core principles in 300 lines of Python, from the self-attention mechanism to the generation process.

AI Paper Explained Series: The Llama 3 Herd of Models: The Pinnacle of Open-Source Large Models
https://s-ai-unix.github.io/posts/2026-01-31-ai-paper-llama3-herd-of-models/
Sat, 31 Jan 2026 09:30:00 +0800
An in-depth reading of Meta AI's Llama 3 paper, from Scaling Laws and model architecture to multimodal extensions, covering the design philosophy and technical details of this family of open-source large models with up to 405B parameters.

AI Paper Explained Series: AlphaGo - Conquering Go with Deep Learning and Tree Search
https://s-ai-unix.github.io/posts/2026-01-30-alphago-paper-interpretation/
Fri, 30 Jan 2026 12:30:00 +0800
An in-depth reading of DeepMind's landmark Nature paper, examining how AlphaGo combined deep neural networks with Monte Carlo tree search to defeat a human professional Go player for the first time.

AI Paper Explained Series: Inception-v4 - Going Deeper with Convolutions
https://s-ai-unix.github.io/posts/2026-01-30-ai-paper-interpretation-series-inception-v4-going-deeper-with-convolutions/
Fri, 30 Jan 2026 12:30:00 +0800
An in-depth reading of Google's Inception-v4 paper. Starting from the evolution of the Inception series, it examines the architectural design ideas behind Inception-v4, the principle of multi-scale feature extraction, and how Inception-ResNet merges residual connections with Inception modules to produce the strongest image classification network of its time.

AI Paper Explained Series: Seq2Seq: The Sequence-to-Sequence Revolution
https://s-ai-unix.github.io/posts/2026-01-30-seq2seq-paper-explained/
Fri, 30 Jan 2026 09:00:00 +0800
An accessible reading of the Seq2Seq paper, from the difficulties of machine translation to the breakthrough of the encoder-decoder architecture, showing the core ideas behind how deep learning handles sequence data.

AI Paper Explained Series: GPT-3: When Language Models Learn to Generalize from a Few Examples
https://s-ai-unix.github.io/posts/2026-01-30-gpt3-few-shot-learners-paper/
Fri, 30 Jan 2026 08:50:00 +0800
An in-depth reading of OpenAI's landmark paper GPT-3: Language Models are Few-Shot Learners, from the Transformer architecture to the paradigm shift of few-shot learning, discussing the emergent abilities and future prospects of large-scale language models.

AI Paper Explained Series: ResNet, Deep Residual Learning
https://s-ai-unix.github.io/posts/2026-01-30-ai-paper-interpretation-series-resnet-deep-residual-learning/
Fri, 30 Jan 2026 08:38:11 +0800
An in-depth reading of the ResNet paper by Kaiming He et al. Starting from the degradation problem in deep networks, it examines the core idea, mathematical principles, and architectural design of residual learning, and explains why simple skip connections make it possible to train extremely deep neural networks.

AlexNet: The Milestone That Launched the Deep Learning Revolution
https://s-ai-unix.github.io/posts/2026-01-29-alexnet-deep-learning-revolution/
Thu, 29 Jan 2026 06:00:00 +0800
An accessible analysis of AlexNet's architecture, key technical innovations, and historical significance, from the ImageNet challenge to the deep learning revolution, with a complete derivation of its mathematical principles.

Variational Autoencoders: An Elegant Bridge from Probabilistic Modeling to Deep Generation
https://s-ai-unix.github.io/posts/2026-01-24-variational-autoencoder/
Sat, 24 Jan 2026 18:30:00 +0800
An in-depth analysis of the mathematical principles and derivations of the variational autoencoder (VAE), from variational inference to ELBO optimization and from reparameterization to generative applications, presenting the VAE's theoretical framework and practical value in full.

Generative Adversarial Networks: The Game Theory of Creating Order from Chaos
https://s-ai-unix.github.io/posts/2026-01-24-gan-comprehensive-guide/
Sat, 24 Jan 2026 11:45:00 +0800
An in-depth discussion of the mathematical principles, training challenges, and application prospects of generative adversarial networks (GANs).

Transformer: The Architectural Revolution That Reshaped the AI World
https://s-ai-unix.github.io/posts/2026-01-21-transformer/
Wed, 21 Jan 2026 10:00:00 +0800
An in-depth reading of the core principles of the Transformer architecture, from the self-attention mechanism to multi-head attention, exploring this important architecture that reshaped the AI world.

The Complete History of the Perceptron: From Linear Classification to the Cornerstone of Deep Learning
https://s-ai-unix.github.io/posts/2026-01-21-perceptron-development-history/
Wed, 21 Jan 2026 08:00:00 +0800
A systematic, easy-to-follow survey of the perceptron's development, from early linear classifiers to the foundation of modern deep learning, with an emphasis on background and evolution.

The Evolution of Neural Network Algorithms: A Seventy-Year Journey from the Perceptron to the Transformer
https://s-ai-unix.github.io/posts/2026-01-15-neural-network-evolution/
Thu, 15 Jan 2026 23:55:00 +0800
A review of seventy years of neural network development, from the perceptron to the Transformer, with detailed explanations of the mathematical principles behind the core algorithms.

Large Language Models: Why AI Can Answer Questions So Quickly and So Cleverly
https://s-ai-unix.github.io/posts/2026-01-14-llm-principle-for-students/
Wed, 14 Jan 2026 08:50:00 +0800
An easy-to-follow guide for middle school and high school students that explains how large language models work, starting from the simple idea of predicting the next word.

Gradients, Gradient Descent, and Backpropagation: The Mathematical Engine from Optimization to Deep Learning
https://s-ai-unix.github.io/posts/2026-01-14-gradient-descent-backpropagation-overview/
Wed, 14 Jan 2026 08:34:44 +0800
A systematic introduction to gradients, gradient descent, and the backpropagation algorithm, along with other applications of gradients. It covers the historical background and application scenarios with full derivations, and compares in detail the three core concepts of gradient, divergence, and curl.

Deep Learning Algorithms Based on Neural Networks: A Complete Guide from the Perceptron to the Transformer
https://s-ai-unix.github.io/posts/2026-01-14-deep-learning-algorithms-comprehensive-guide/
Wed, 14 Jan 2026 08:30:00 +0800
A comprehensive review of the history, mathematical principles, architectural evolution, and future prospects of deep learning algorithms, covering the full path from basic neural networks to the Transformer.