While we wait for the age of Apple Intelligence, it may be worth considering a recent Apple research study that exposes critical weaknesses in existing artificial intelligence models. Apple’s researchers wanted to determine the extent to which large language models (LLMs) such as GPT-4o, Llama, Phi, Gemma, or Mistral can actually engage in genuine logical reasoning to reach their conclusions and make their recommendations. The study shows that, despite the hype, LLMs don’t really perform logical reasoning; they simply reproduce the reasoning steps they learned from their training data.
Source: ComputerWorld