1. The document discusses energy-based models (EBMs) and how they can be applied to classifiers. It introduces noise contrastive estimation and flow contrastive estimation as methods to train EBMs.
2. One of the papers presented trains energy-based models with flow contrastive estimation, using a flow-based generator as the contrast distribution; this allows implicit modeling with EBMs.
3. Another paper argues that classifiers can be viewed as joint energy-based models over inputs and outputs, and should be treated as such. It introduces a method to train classifiers as EBMs using contrastive divergence (see the sketch after this list).
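To make the joint-energy view concrete, here is a minimal numpy sketch: given classifier logits f(x), one can define a joint energy E(x, y) = -f(x)[y], and marginalizing y out with a log-sum-exp yields an unnormalized density over inputs. The linear logits and all names are illustrative stand-ins, not the paper's implementation.

```python
import numpy as np

def logits(x, W, b):
    # A toy linear "classifier" standing in for any network producing K logits.
    return W @ x + b

def joint_energy(x, y, W, b):
    # Joint energy over (input, label): E(x, y) = -f(x)[y], so p(x, y) ∝ exp(f(x)[y]).
    return -logits(x, W, b)[y]

def marginal_energy(x, W, b):
    # Marginalizing over y gives E(x) = -logsumexp_y f(x)[y]:
    # the same logits define an unnormalized density over inputs alone.
    f = logits(x, W, b)
    m = f.max()
    return -(m + np.log(np.exp(f - m).sum()))

rng = np.random.default_rng(0)
W, b = rng.normal(size=(3, 4)), rng.normal(size=3)
x = rng.normal(size=4)
print(joint_energy(x, 1, W, b), marginal_energy(x, W, b))
```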
【DL輪読会】Efficiently Modeling Long Sequences with Structured State Spaces (Deep Learning JP)
This document summarizes a research paper on modeling long-range dependencies in sequence data using structured state space models and deep learning. The proposed S4 model (1) derives recurrent and convolutional representations of state space models, (2) improves long-term memory using HiPPO matrices, and (3) efficiently computes state space model convolution kernels. Experiments show S4 outperforms existing methods on various long-range dependency tasks, achieves fast and memory-efficient computation comparable to efficient Transformers, and performs competitively as a general sequence model.
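The equivalence between the recurrent and convolutional views of a state space model can be checked in a few lines. The following numpy sketch assumes an already-discretized SSM with arbitrary stable matrices; S4's actual contribution, the efficient kernel computation with HiPPO-initialized structure, is not reproduced here.

```python
import numpy as np

def ssm_recurrent(A, B, C, u):
    # Run the discretized state space model as an RNN:
    # x_k = A x_{k-1} + B u_k,  y_k = C x_k.
    x = np.zeros(A.shape[0])
    ys = []
    for u_k in u:
        x = A @ x + B * u_k
        ys.append(C @ x)
    return np.array(ys)

def ssm_kernel(A, B, C, L):
    # Unroll the same model into a convolution kernel K_k = C A^k B;
    # S4 computes this kernel efficiently instead of by explicit powering.
    K, Ak = [], np.eye(A.shape[0])
    for _ in range(L):
        K.append(C @ Ak @ B)
        Ak = A @ Ak
    return np.array(K)

rng = np.random.default_rng(0)
A = 0.9 * np.eye(2) + 0.05 * rng.normal(size=(2, 2))
B, C = rng.normal(size=2), rng.normal(size=2)
u = rng.normal(size=16)
K = ssm_kernel(A, B, C, len(u))
y_conv = np.array([K[: k + 1][::-1] @ u[: k + 1] for k in range(len(u))])
print(np.allclose(ssm_recurrent(A, B, C, u), y_conv))  # True: the two views agree
```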
A rough overview of hyperparameter search using Bayesian optimization.
The paper on which this presentation is based:
Bergstra, James, et al. "Algorithms for Hyper-Parameter Optimization." Advances in Neural Information Processing Systems 24 (NIPS 2011), 2011.
https://hal.inria.fr/hal-00642998/
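The core of the paper's tree-structured Parzen estimator (TPE) can be sketched in one dimension as follows; the split quantile gamma, the kernel bandwidth, and the candidate count are illustrative choices, not the paper's settings.

```python
import numpy as np

def tpe_suggest(xs, losses, n_candidates=256, gamma=0.25, bw=0.1, rng=None):
    # Tree-structured Parzen Estimator step (Bergstra et al., 2011), 1-D sketch:
    # split past trials into "good" (lowest gamma quantile of loss) and "bad",
    # model each group with a kernel density estimate, and propose the candidate
    # maximizing the ratio l(x) / g(x), a proxy for expected improvement.
    rng = rng or np.random.default_rng()
    xs, losses = np.asarray(xs), np.asarray(losses)
    n_good = max(1, int(gamma * len(xs)))
    order = np.argsort(losses)
    good, bad = xs[order[:n_good]], xs[order[n_good:]]

    def kde(points, q):  # Gaussian kernel density estimate at candidates q
        return np.mean(
            np.exp(-0.5 * ((q[:, None] - points[None, :]) / bw) ** 2), axis=1
        ) / (bw * np.sqrt(2 * np.pi))

    cand = rng.uniform(xs.min(), xs.max(), n_candidates)
    score = kde(good, cand) / (kde(bad, cand) + 1e-12)
    return cand[np.argmax(score)]

# Usage: minimize a toy objective f(x) = (x - 0.3)^2 over [0, 1].
rng = np.random.default_rng(0)
xs = list(rng.uniform(0, 1, 5))
losses = [(x - 0.3) ** 2 for x in xs]
for _ in range(20):
    x = tpe_suggest(xs, losses, rng=rng)
    xs.append(x)
    losses.append((x - 0.3) ** 2)
print(min(xs, key=lambda x: (x - 0.3) ** 2))  # should approach 0.3
```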
Several recent papers have explored self-supervised learning methods for vision transformers (ViT). Key approaches include the following (a minimal sketch of the masked-prediction idea follows the list):
1. Masked prediction tasks that predict masked patches of the input image.
2. Contrastive learning using techniques like MoCo to learn representations by contrasting augmented views of the same image.
3. Self-distillation methods like DINO that distill a teacher ViT into a student ViT using different views of the same image.
4. Hybrid approaches that combine masked prediction with self-distillation, such as iBOT.
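To illustrate approach 1, here is a minimal masked-prediction sketch in numpy; the patch size, mask ratio, and the linear map standing in for the ViT are all assumptions made for the sake of a runnable example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Masked patch prediction reduced to its core: split an image into patches,
# hide a random subset, and score reconstruction only on the hidden patches.
image = rng.normal(size=(8, 8))
patches = image.reshape(4, 2, 4, 2).transpose(0, 2, 1, 3).reshape(16, 4)

masked = rng.choice(16, size=12, replace=False)   # 75% mask ratio
visible = np.setdiff1d(np.arange(16), masked)

W = rng.normal(size=(4, 4)) * 0.1                 # stand-in for the ViT
context = patches[visible].mean(axis=0)           # pooled visible tokens
pred = np.tile(context @ W, (len(masked), 1))     # predict the hidden patches

# The self-supervised loss is computed only on the masked patches;
# a real model would backpropagate this through the transformer.
loss = np.mean((pred - patches[masked]) ** 2)
print(f"reconstruction loss on masked patches: {loss:.3f}")
```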
【DL輪読会】NeRF-VAE: A Geometry Aware 3D Scene Generative Model (Deep Learning JP)
NeRF-VAE is a 3D scene generative model that combines Neural Radiance Fields (NeRF) and Generative Query Networks (GQN) with a variational autoencoder (VAE). It uses a NeRF decoder to generate novel views conditioned on a latent code. An encoder extracts latent codes from input views. During training, it maximizes the evidence lower bound to learn the latent space of scenes and allow for novel view synthesis. NeRF-VAE aims to generate photorealistic novel views of scenes by leveraging NeRF's view synthesis abilities within a generative model framework.
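The quantity being maximized can be written down compactly. The following sketch computes a single-sample evidence lower bound with a Gaussian posterior and a linear stand-in for the NeRF decoder; all shapes and weights are illustrative, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def elbo(views, target, enc_W, dec_W, rng):
    # Evidence lower bound for a VAE over scenes:
    # ELBO = E_q[log p(target | z)] - KL(q(z | views) || N(0, I)).
    h = views.mean(axis=0) @ enc_W                         # pooled view encoding
    mu, log_var = h[:4], h[4:]                             # Gaussian posterior
    z = mu + np.exp(0.5 * log_var) * rng.normal(size=4)    # reparameterization
    recon = z @ dec_W                                      # stand-in for the NeRF decoder
    log_lik = -0.5 * np.sum((recon - target) ** 2)         # Gaussian log-likelihood
    kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)
    return log_lik - kl

views = rng.normal(size=(3, 16))     # three observed views of a scene
target = rng.normal(size=16)         # held-out view to reconstruct
enc_W, dec_W = rng.normal(size=(16, 8)), rng.normal(size=(4, 16))
print(f"ELBO: {elbo(views, target, enc_W, dec_W, rng):.2f}")
```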
Derivative of sine function: A graphical explanation (Hideo Hirose)
The derivative of the sine function can be derived using a limit for the sine function, but the transformation can be hard to follow. I have therefore drawn a figure illustrating the differentiation.
The differentiation of the sine function, d sin x / dx = cos x, is usually explained by converting the difference formula for sine into a product and then using sin x / x → 1 (x → 0).
Here I show it geometrically. The limit sin x / x → 1 (x → 0) is merely hidden from view, but in the end...
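For reference, the standard product-formula derivation mentioned above, written out; this is the textbook argument, not the geometric one from the slides.

```latex
\begin{align*}
\frac{d}{dx}\sin x
  &= \lim_{h \to 0} \frac{\sin(x+h) - \sin x}{h} \\
  &= \lim_{h \to 0} \frac{2\cos\!\left(x + \tfrac{h}{2}\right)\sin\tfrac{h}{2}}{h}
     && \text{(difference-to-product formula)} \\
  &= \lim_{h \to 0} \cos\!\left(x + \tfrac{h}{2}\right)
     \cdot \frac{\sin\tfrac{h}{2}}{\tfrac{h}{2}}
   = \cos x
     && \left(\tfrac{\sin t}{t} \to 1 \text{ as } t \to 0\right)
\end{align*}
```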
Success/Failure Prediction for Final Examination using the Trend of Weekly Online Testing (Hideo Hirose)
H. Hirose, Success/Failure Prediction for Final Examination using the Trend of Weekly Online Testing, 7th International Conference on Learning Technologies and Learning Environments (LTLE2018), pp.139-145, July 8-12, 2018.
Attendance to Lectures is Crucial in Order Not to Drop Out (Hideo Hirose)
H. Hirose, Attendance to Lectures is Crucial in Order Not to Drop Out, 7th International Conference on Learning Technologies and Learning Environments (LTLE2018), pp.194-198, July 8-12, 2018.
How many times must we toss a coin, on average, until we observe the sequence head, tail, head? The answer is ten, not eight. This is an intriguing result that runs against our intuition.
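A quick Monte Carlo check of this claim (a straightforward simulation, not from the slides):

```python
import numpy as np

def mean_wait(pattern, trials=100_000):
    # Average number of fair-coin tosses until `pattern` (e.g. "HTH") appears.
    rng = np.random.default_rng(0)
    total = 0
    for _ in range(trials):
        history = ""
        while not history.endswith(pattern):
            history += "HT"[rng.integers(2)]
        total += len(history)
    return total / trials

print(f"HTH: {mean_wait('HTH'):.2f}  (theory: 10)")
print(f"HTT: {mean_wait('HTT'):.2f}  (theory: 8)")
```

The asymmetry comes from self-overlap: HTH can overlap with a shifted copy of itself, which lengthens its expected waiting time relative to HTT.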
Interesting but difficult problem: find the optimum saury layout on a gridiron (Hideo Hirose)
Even though the problem is simple and playful (finding the optimum layout for baking saury on a fish gridiron heated by Joule heating), it can spark interest in science by combining viewpoints from electrical engineering, linear algebra, and probability: solving systems of linear equations and Poisson's equation, and applying the central limit theorem to the situation. In addition, by removing the constraints we can create a new problem free from common sense. Presenting funny but essential problems could be another approach to active learning through interdisciplinary scientific methods.
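As a taste of the Poisson-equation ingredient, here is a minimal finite-difference solver on a square grid; the grid size, source term, and boundary condition are illustrative and do not reproduce the slides' Joule-heating setup.

```python
import numpy as np

# Finite-difference solve of Poisson's equation -∇²u = f on a square grid
# with u = 0 on the boundary, using Jacobi iteration.
n, h = 32, 1.0 / 33
f = np.zeros((n, n))
f[n // 2, n // 2] = 1.0 / h**2        # point heat source in the middle

u = np.zeros((n + 2, n + 2))          # padded with the zero boundary
for _ in range(2000):
    u[1:-1, 1:-1] = 0.25 * (
        u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:] + h**2 * f
    )
print(f"peak temperature (arbitrary units): {u.max():.4f}")
```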
With an 80-step Galton board, we can see the binomial distribution being approximated by the normal distribution.
YouTube:
https://www.youtube.com/watch?v=3w4e1RQTAB8
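The claim is easy to reproduce numerically; the following sketch simulates the board directly (the ball count is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)

# Each of 80 rows deflects a ball left or right with probability 1/2,
# so the final bin is Binomial(80, 1/2), which the central limit theorem
# says is close to N(40, 20).
steps, balls = 80, 100_000
bins = rng.binomial(steps, 0.5, size=balls)

mean, var = bins.mean(), bins.var()
print(f"sample mean {mean:.2f} (theory 40), sample variance {var:.2f} (theory 20)")

# Crude text histogram of the central bins.
for k in range(30, 51):
    count = np.sum(bins == k)
    print(f"{k:3d} | " + "#" * (count // 200))
```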
The cumulative exposure model (CEM) is often used to express the failure probability in the step-up test method, in which the stress is raised step by step until breakdown occurs. This model is widely accepted in reliability fields because the accumulation of fatigue is considered reasonable. In contrast, the memoryless model (MM) is also used in electrical engineering, because in some cases no accumulation of fatigue is observed. We propose a new model, the extended cumulative exposure model (ECEM), which includes features of both models. A simulation study and an application to an actual oil-insulation breakdown experiment support the validity of the proposed model. The independence model (IM) is also discussed.
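To see how the two models can disagree, here is a minimal numerical sketch assuming Weibull step lives with an inverse-power-law stress dependence; all constants are illustrative, not fitted to the oil-insulation data.

```python
import numpy as np

# Survival under a step-up voltage test: stress is raised every dt hours.
# Per-level life is Weibull with shape beta and an inverse-power-law scale
# eta(V) = (V0 / V) ** p.
beta, p, V0, dt = 2.0, 6.0, 1.0, 1.0
V = np.array([1.0, 1.1, 1.2, 1.3, 1.4])   # step-up stress levels
eta = (V0 / V) ** p

# Cumulative exposure model: damage carries over between steps, so the
# exposures dt / eta(V_j) add up inside a single Weibull survival function.
S_cem = np.exp(-np.cumsum(dt / eta) ** beta)

# Memoryless model: each step is survived independently of the history,
# so the per-step survival probabilities simply multiply.
S_mm = np.cumprod(np.exp(-(dt / eta) ** beta))

for i, (a, b) in enumerate(zip(S_cem, S_mm), 1):
    print(f"after step {i}: CEM survival {a:.3f}, MM survival {b:.3f}")
```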
Parameter estimation for the truncated Weibull model using the ordinary differential equation (Hideo Hirose)
In estimating the number of failures using truncated data under the Weibull model, we often encounter cases where the estimate is smaller than the true value when the likelihood principle is applied to the conditional probability. In infectious-disease prediction, the SIR model, described by simultaneous ordinary differential equations, is often used, and it can predict the final-stage condition, i.e., the total number of infected patients, well even when few observations are available. The two models share the same condition on the observed data: truncation to the right. We therefore investigated whether the number of failures in the Weibull model can be estimated accurately using an ordinary differential equation, and we present positive results for this conjecture.
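For reference, the SIR side of the analogy: the model is a set of simultaneous ODEs whose integration yields the final epidemic size. The parameters below are illustrative; the paper's data and fitting procedure are not reproduced.

```python
import numpy as np
from scipy.integrate import odeint

def sir(y, t, beta, gamma):
    # Classical SIR model as simultaneous ODEs:
    # dS/dt = -beta*S*I,  dI/dt = beta*S*I - gamma*I,  dR/dt = gamma*I.
    S, I, R = y
    return [-beta * S * I, beta * S * I - gamma * I, gamma * I]

beta, gamma = 0.5, 0.1
t = np.linspace(0, 100, 1000)
S, I, R = odeint(sir, [0.99, 0.01, 0.0], t, args=(beta, gamma)).T

# The "final stage" quantity analogous to the total number of failures:
print(f"final fraction ever infected: {R[-1]:.3f}")
```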