ARC-AGI Without Pretraining: Achieving State-of-the-Art Performance

In this work on ARC-AGI (the Abstraction and Reasoning Corpus for Artificial General Intelligence) without pretraining, Isaac Liao and Albert Gu introduce CompressARC as a novel approach. They build on the concept of Relative Entropy Coding (REC), assuming an idealized REC algorithm exists despite practical limitations such as exponential runtime complexity and approximate decoding.
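
To make the REC accounting concrete, here is a minimal sketch (an illustration, not the authors' code) of the standard cost model behind it: transmitting one sample from a diagonal Gaussian posterior to a receiver who knows only a standard normal prior costs roughly the KL divergence between the two. The function name and tensor shapes are assumptions.

```python
import torch

def gaussian_kl_bits(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), converted from nats to bits.

    Under the idealized-REC assumption, this KL is (approximately) the number
    of bits needed to communicate one posterior sample to a receiver who
    knows only the standard normal prior.
    """
    kl_nats = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1.0).sum()
    return kl_nats / torch.log(torch.tensor(2.0))  # nats -> bits

# Example: a small latent with nearly-prior-matching parameters
mu = torch.randn(4, 8) * 0.1
logvar = torch.full((4, 8), -2.0)
print(f"code length ~ {gaussian_kl_bits(mu, logvar).item():.1f} bits")
```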

CompressARC uses layers that compress information while preserving the details needed to solve complex tasks such as reconstructing an image from partial data or assembling a puzzle from scrambled pieces. Each layer consists of several steps:
1. Tensor decomposition into mean, variance, and residual components using REC-like techniques. This step reduces complexity by approximating high-dimensional tensors with lower-dimensional distributional parameters, preserving information content up to a target capacity set relative to the tensor's size (see the decoding sketch after this list).
2. Normalization of each tensor to zero mean and unit variance by subtracting its mean and dividing by its standard deviation, computed across all non-channel dimensions (except the ‘example’ dimension). This step stabilizes learning during training while maintaining information content (see the normalization sketch after this list).
3. Postprocessing that uses signal-to-noise-ratio calculations to control how much information is allowed to leak between layers while preserving the details needed to solve the task. The postprocessed output tensors are then combined by a linear map and bias, followed by an angle-dependent operation that averages multiple slices of the same shape, which can correspond to different correct puzzle solutions.
4. Application of logsumexp to the log-probabilities associated with these slices, so the loss accounts for multiple possible solutions without converging prematurely on any one slice during training (see the logsumexp sketch after this list). The masks are annealed on a coefficient schedule so that they do not dominate the early stages of learning.
5. Combination of the input and output masks before they are used in subsequent operations, where channel dimensions may not align perfectly between inputs and outputs because different grid shapes are involved at different steps of the model architecture.
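
For step 1, here is a rough sketch of a REC-style decoding layer, assuming a standard VAE-like parameterization: a learned mean and log-variance per element, a reparameterized sample, and a KL term that prices the sample's information content. The class name, shapes, and initialization are illustrative, not the paper's implementation.

```python
import torch
import torch.nn as nn

class DecodingLayer(nn.Module):
    """Illustrative sketch: represent a tensor by learned mean and
    log-variance, sample it via the reparameterization trick, and return
    the KL term measuring the sample's information content."""
    def __init__(self, shape):
        super().__init__()
        self.mean = nn.Parameter(torch.zeros(shape))
        self.logvar = nn.Parameter(torch.zeros(shape))

    def forward(self):
        std = (0.5 * self.logvar).exp()
        z = self.mean + std * torch.randn_like(std)      # reparameterized sample
        kl = 0.5 * (self.mean.pow(2) + self.logvar.exp()
                    - self.logvar - 1.0).sum()           # information content (nats)
        return z, kl

layer = DecodingLayer((4, 8))
z, kl = layer()
print(z.shape, f"KL = {kl.item():.3f} nats")
```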
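
For step 2, a small sketch of the normalization described above, assuming an (example, channel, height, width) layout so that statistics are taken over everything except the example and channel dimensions; the layout and epsilon are assumptions.

```python
import torch

def normalize(x: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Zero mean, unit variance across every dimension except the first
    two, which here stand in for the `example` and `channel` dimensions."""
    dims = tuple(range(2, x.ndim))                 # all remaining dimensions
    mean = x.mean(dim=dims, keepdim=True)
    std = x.std(dim=dims, keepdim=True, unbiased=False)
    return (x - mean) / (std + eps)

x = torch.randn(3, 16, 10, 10)                     # (example, channel, height, width)
y = normalize(x)
print(y.mean(dim=(2, 3)).abs().max().item())       # close to 0
```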
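
For step 4, a sketch of logsumexp pooling over candidate solution slices; the `anneal` parameter is a hypothetical stand-in for the coefficient schedule mentioned above (small values keep the mixture flat early in training, larger values let the best slice dominate later).

```python
import torch

def multi_solution_loss(slice_logps: torch.Tensor,
                        anneal: float = 1.0) -> torch.Tensor:
    """Combine per-slice log-probabilities with a tempered logsumexp so
    credit flows to any plausible solution slice instead of the model
    collapsing onto one too early."""
    # Smooth-max over the slice dimension, tempered by `anneal`.
    pooled = torch.logsumexp(anneal * slice_logps, dim=0) / anneal
    return -pooled.mean()   # negative log-likelihood of the pooled slices

logps = torch.log_softmax(torch.randn(5, 30, 30), dim=0)  # 5 candidate slices
print(multi_solution_loss(logps, anneal=0.1).item())
```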

The authors emphasize that their approach is theoretical at this stage, since many practical issues around REC, such as implementation details and runtime efficiency, remain unresolved. It nonetheless serves as a foundation for further exploration of ARC-AGI architectures that require no extensive pretraining datasets.
