@hillbig
Co-founder and Representative Director of Preferred Networks (PFN). CEO of Preferred Computational Chemistry and Preferred Element. Interested in AI and science
We train a neural network to compute function integrals, and to solve complex differential equations.
深層学習の数理 (The Mathematics of Deep Learning) - Download as a PDF or view online for free
Deep neural networks (DNNs) have demonstrated dominating performance in many fields; since AlexNet, networks used in practice have become wider and deeper. On the theoretical side, a long line of...
We show how the success of deep learning could depend not only on mathematics but also on physics: although well-known mathematical theorems guarantee that neural networks can approximate...
Least-mean squares (LMS) solvers such as Linear / Ridge / Lasso-Regression, SVD and Elastic-Net not only solve fundamental machine learning problems, but are also the building blocks in a variety...
Nouha Dziri, Sivan Milton, Mo Yu, Osmar Zaiane, Siva Reddy. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technol...
A closer look at how Temporal Difference Learning merges paths of experience for greater statistical efficiency
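For readers who want the mechanics, here is a minimal tabular TD(0) sketch; the state names and episode format are illustrative, not from the article. The bootstrapped target lets value estimates flow through states shared by different trajectories, which is the "merging of paths" being described.

```python
from collections import defaultdict

def td0_evaluate(episodes, alpha=0.1, gamma=1.0):
    """Tabular TD(0) policy evaluation.

    Unlike Monte Carlo, the bootstrapped target r + gamma * V[s_next]
    shares information between trajectories that pass through the
    same state.
    """
    V = defaultdict(float)
    for episode in episodes:  # episode: list of (s, r, s_next) transitions
        for s, r, s_next in episode:
            target = r + gamma * V[s_next]
            V[s] += alpha * (target - V[s])
    return V

# Two trajectories that merge at the shared state "B":
episodes = [
    [("A", 0.0, "B"), ("B", 1.0, "T")],
    [("C", 0.0, "B"), ("B", 1.0, "T")],
]
print(dict(td0_evaluate(episodes * 100)))
```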
Convolutional neural networks are witnessing wide adoption in computer vision systems with numerous applications across a range of visual recognition tasks. Much of this progress is fueled through...
The chairman and CEO of Pfizer tells, in his own words, the behind-the-scenes story of COVID-19 vaccine development. Under the author's visionary leadership, Pfizer's scientists and their partner BioNTech fought side by side through nine intense months of 2020. It is the story of how they "made the impossible possible": completing the development, clinical trials, and manufacturing of a safe and effective COVID-19 vaccine, a process that would ordinarily take many years, in just nine months.
More info: https://deepmind.com/blog/neural-scene-representation-and-rendering/
Supplementary video for the paper "Grokking Happens All the Time and Here is Why" (bit.ly/grok-adversarial). We train a 4-layer MLP of width 256 on 1000 training samples...
The goal of this series is to chronicle opinions and issues in the field of machine learning as they stand today and as they change over time. The plan is to host this survey periodically until...
Sorting an array is a fundamental routine in machine learning, one that is used to compute rank-based statistics, cumulative distribution functions (CDFs), quantiles, or to select closest...
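As a toy illustration of making ranks differentiable, here is an O(n²) pairwise-sigmoid relaxation of our own; it is not the paper's O(n log n) construction, just the simplest smooth surrogate for the hard rank.

```python
import numpy as np

def soft_rank(x, tau=0.1):
    """Differentiable relaxation of (ascending, 1-indexed) ranks.

    rank_i ~= 1 + sum_{j != i} sigmoid((x_i - x_j) / tau); as tau -> 0
    this approaches the hard ranks. The self-comparison contributes
    sigmoid(0) = 0.5, which we subtract out.
    """
    diff = (x[:, None] - x[None, :]) / tau
    return 1.0 + (1.0 / (1.0 + np.exp(-diff))).sum(axis=1) - 0.5

x = np.array([0.3, -1.2, 2.5, 0.9])
print(soft_rank(x))                   # close to [2, 1, 4, 3]
print(np.argsort(np.argsort(x)) + 1)  # hard ranks, for comparison
```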
A method for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views.
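For context, this is the discrete volume-rendering quadrature that NeRF-style methods use to composite a ray's color from sampled densities and colors (notation follows the paper: sigma_i and c_i are the density and color at sample i, delta_i the spacing between adjacent samples):

```latex
\hat{C}(\mathbf{r}) = \sum_{i=1}^{N} T_i \bigl(1 - e^{-\sigma_i \delta_i}\bigr)\,\mathbf{c}_i,
\qquad
T_i = \exp\!\Bigl(-\sum_{j=1}^{i-1} \sigma_j \delta_j\Bigr).
```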
Recurrent neural networks (RNNs) are particularly well-suited for modeling long-term dependencies in sequential data, but are notoriously hard to train because the error backpropagated in time...
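One standard remedy for the exploding-gradient half of this problem is gradient-norm clipping; a minimal sketch follows (the threshold value is arbitrary). Vanishing gradients, by contrast, typically need architectural fixes such as gating or careful initialization.

```python
import numpy as np

def clip_gradient_norm(grads, max_norm=1.0):
    """Rescale a list of gradient arrays when their global L2 norm
    exceeds max_norm, leaving their direction unchanged."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads

grads = [np.random.randn(4, 4) * 10.0, np.random.randn(4) * 10.0]
clipped = clip_gradient_norm(grads, max_norm=5.0)
print(np.sqrt(sum(np.sum(g ** 2) for g in clipped)))  # <= 5.0
```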
The framework of reinforcement learning or optimal control provides a mathematical formalization of intelligent decision making that is powerful and broadly applicable. While the general form of...
We propose semantic entropy probes (SEPs), a cheap and reliable method for uncertainty quantification in Large Language Models (LLMs). Hallucinations, which are plausible-sounding but factually...
We empirically study a simple layer-pruning strategy for popular families of open-weight pretrained LLMs, finding minimal degradation of performance on different question-answering benchmarks...
We present a method to produce abstractive summaries of long documents that exceed several thousand words via neural abstractive summarization. We perform a simple extractive step before...
A column for keeping up with world-class AI trends. Daisuke Okanohara, co-founder of Preferred Networks, an AI startup backed by FANUC and Toyota Motor Corporation, explains cutting-edge developments in AI.
How does the neocortex learn and develop the foundations of all our high-level cognitive abilities? We present a comprehensive framework spanning biological, computational, and cognitive levels,...
Our AI system surpasses the state-of-the-art approach for geometry problems, advancing AI reasoning in mathematics
The success of deep learning in vision can be attributed to: (a) models with high capacity; (b) increased computational power; and (c) availability of large-scale labeled data. Since 2012, there...
Neural network pruning and quantization techniques are almost as old as neural networks themselves. However, to date only ad-hoc comparisons between the two have been published. In this paper, we...
We present evidence that language models (LMs) of code can learn to represent the formal semantics of programs, despite being trained only to perform next-token prediction. Specifically, we train...
Neural networks are known to be a class of highly expressive functions able to fit even random input-output mappings with 100% accuracy. In this work, we present properties of neural networks...
‘Blind’ AI navigation agents (with only egomotion sensing) can learn to navigate new environments and build map-like representations (supporting the ability to take shortcuts, follow walls, and...
Neural network pruning techniques can reduce the parameter counts of trained networks by over 90%, decreasing storage requirements and improving computational performance of inference without...
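A minimal sketch of the simplest such baseline, one-shot global magnitude pruning; this is illustrative and not necessarily the exact procedure in the paper.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """Zero out the smallest-magnitude weights globally across layers,
    keeping roughly (1 - sparsity) of all parameters."""
    flat = np.concatenate([w.ravel() for w in weights])
    threshold = np.quantile(np.abs(flat), sparsity)
    return [np.where(np.abs(w) >= threshold, w, 0.0) for w in weights]

layers = [np.random.randn(64, 64), np.random.randn(64, 10)]
pruned = magnitude_prune(layers, sparsity=0.9)
kept = sum((w != 0).sum() for w in pruned) / sum(w.size for w in layers)
print(f"fraction of weights kept: {kept:.2f}")  # ~0.10
```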
Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single...
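A sketch of the absmean ternary quantizer described in the BitNet b1.58 report: weights are scaled by their mean absolute value, then rounded and clipped to {-1, 0, +1}. The epsilon and shapes here are illustrative.

```python
import numpy as np

def absmean_ternary_quantize(W, eps=1e-6):
    """Quantize a weight matrix to ternary values {-1, 0, +1}.

    gamma is the mean absolute weight; dequantize as W_q * gamma.
    """
    gamma = np.abs(W).mean() + eps
    W_q = np.clip(np.round(W / gamma), -1, 1)
    return W_q, gamma

W = np.random.randn(4, 4)
W_q, gamma = absmean_ternary_quantize(W)
print(W_q)  # entries in {-1, 0, 1}
```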
Recent language models generate false but plausible-sounding text with surprising frequency. Such "hallucinations" are an obstacle to the usability of language-based AI systems and can harm people...
Leading researchers in Geometric & Graph ML summarise the progress in 2021 and make predictions for 2022
In this study, we examine the representation learning abilities of Denoising Diffusion Models (DDM) that were originally purposed for image generation. Our philosophy is to deconstruct a DDM,...
This blog is intended to be a place to share ideas and results that are too weird, incomplete, or off-topic to turn into an academic paper, but that I think may be important. Let me know what you...
Deep Learning Practice and Theory - Download as a PDF or view online for free
We examine gradient descent on unregularized logistic regression problems, with homogeneous linear predictors on linearly separable datasets. We show the predictor converges to the direction of...
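Concretely, the limit direction is the L2 max-margin (hard-margin SVM) solution:

```latex
% Gradient descent on logistic loss over linearly separable data:
% the iterates' direction converges to the hard-margin SVM solution
\hat{w} = \arg\min_{w}\ \|w\|_2^2
\quad \text{s.t.} \quad y_n\, w^\top x_n \ge 1 \ \ \forall n,
\qquad
\lim_{t\to\infty} \frac{w(t)}{\|w(t)\|} = \frac{\hat{w}}{\|\hat{w}\|}.
```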
Batch normalization (BatchNorm) has become a standard technique in deep learning. Its popularity is in no small part due to its often positive effect on generalization. Despite this success, the...
Transformers are considered conceptually different from the previous generation of state-of-the-art NLP models - recurrent neural networks (RNNs). In this work, we demonstrate that decoder-only...
How can we explain the predictions of a black-box model? In this paper, we use influence functions -- a classic technique from robust statistics -- to trace a model's prediction through the...
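The key quantity is the influence of upweighting a training point z on the loss at a test point z_test, evaluated at the learned parameters:

```latex
% Influence of upweighting training point z on the loss at z_test
\mathcal{I}_{\mathrm{up,loss}}(z, z_{\mathrm{test}})
  = -\,\nabla_\theta L(z_{\mathrm{test}}, \hat\theta)^\top
     H_{\hat\theta}^{-1}\,
     \nabla_\theta L(z, \hat\theta),
\qquad
H_{\hat\theta} = \frac{1}{n}\sum_{i=1}^{n} \nabla_\theta^2 L(z_i, \hat\theta).
```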
Deep reinforcement learning (RL) methods have driven impressive advances in artificial intelligence in recent years, exceeding human performance in domains ranging from Atari to Go to no-limit poker....
Over-parameterized models attract much attention in the era of data science and deep learning. It is empirically observed that although these models, e.g. deep neural networks, overfit the...
Update: Applications for the PLaMo beta trial and participant login are available here: https://plamo.preferredai.jp/ Preferred Networks, Inc. (Headquarters: Chiyoda-ku, Tokyo; Representative Director [...]
Many domains of science have developed complex simulations to describe phenomena of interest. While these simulations provide high-fidelity models,...
On September 28, Preferred Networks released PLaMo-13B, a large language model (LLM)...
Recently, Multimodal Large Language Models (MLLMs), represented by GPT-4V, have become a rising research hotspot; they use powerful Large Language Models (LLMs) as a brain to perform multimodal...
Recent empirical works have successfully used unlabeled data to learn feature representations that are broadly useful in downstream classification tasks. Several of these methods are reminiscent...
Despite impressive performance on numerous visual tasks, Convolutional Neural Networks (CNNs) --- unlike brains --- are often highly sensitive to small perturbations of their input, e.g....
While the optimization problem behind deep neural networks is highly non-convex, it is frequently observed in practice that training deep networks seems possible without getting stuck in...
Machine learning is predicated on the concept of generalization: a model achieving low error on a sufficiently large training set should also perform well on novel samples from the same...
The lottery ticket hypothesis (Frankle and Carbin, 2018) states that a randomly-initialized network contains a small subnetwork that, when trained in isolation, can compete with the...
We study finite sample expressivity, i.e., memorization power of ReLU networks. Recent results require N hidden nodes to memorize/interpolate arbitrary N data points. In contrast, by...
What is the best paradigm to recognize objects---discriminative inference (fast but potentially prone to shortcut learning) or using a generative model (slow but potentially more robust)? We build...
We’ve created GPT-4, the latest milestone in OpenAI’s effort in scaling up deep learning. GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less...
As per the announcement of September 28, 2023, PFN [...] Preferred Elements, Inc., a company that develops and sells multimodal foundation models (Headquarters: Chiyoda-ku, Tokyo; President and CEO: Daisuke Okanohara; hereinafter [...]
In order to integrate uncertainty estimates into deep time-series modelling, Kalman Filters (KFs) (Kalman, 1960) have been combined with deep learning models; however, such approaches...
We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8...
Exactly five years have passed since I wrote the 2012 blog post ニューラルネットの逆襲 ("The Counterattack of Neural Networks"; see the comments from that time). At the time, Deep...
Clustering is a class of unsupervised learning methods that has been extensively applied and studied in computer vision. Little work has been done to adapt it to the end-to-end training of visual...
Preferred Networks (PFN) President and CEO Toru Nishikawa and Executive Vice President Daisuke Okanohara will publish their first book, 『Learn or Die 死ぬ気で学べ プリファードネットワークスの挑戦』 (Learn or Die: The Challenge of Preferred Networks) [...]
The defeat of professional Go players by Google's AlphaGo was recognized as a historic event in which artificial intelligence achieved learning beyond human ability. Reinforcement learning not only plays an important role there; it is also known for applications to important fields such as autonomous driving and robot control, and it is now attracting strong public interest. On the other hand, there are few textbooks for studying reinforcement learning systematically in Japanese; the representative textbook, Sutton and Barto (1998), and its translation...
We propose a differentiable family of "kaleidoscope matrices," prove that all structured matrices can be represented in this form, and use them to replace hand-crafted linear maps in deep learning...
We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities. A distinct production version of Codex powers GitHub...
Unsupervised image-to-image translation is an inherently ill-posed problem. Recent methods based on deep encoder-decoder architectures have shown impressive results, but we show that they only...
Fine-grained image labels are desirable for many computer vision applications, such as visual search or mobile AI assistant. These applications rely on image classification models that can produce...
In deep learning, depth, as well as nonlinearity, create non-convex loss surfaces. Then, does depth alone create bad local minima? In this paper, we prove that without...
Recent studies have discovered that Chain-of-Thought prompting (CoT) can dramatically improve the performance of Large Language Models (LLMs), particularly when dealing with complex tasks...
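A minimal few-shot chain-of-thought prompt in the usual style; the exemplar below is the widely used arithmetic demonstration and is illustrative rather than this paper's exact prompt.

```python
# Few-shot CoT: the demonstration includes intermediate reasoning steps,
# which the model is expected to imitate before giving its final answer.
prompt = """Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.
Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 balls.
5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought
6 more, how many apples do they have?
A:"""
print(prompt)
```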