Berliner Boersenzeitung - Inner workings of AI an enigma

Photo: Kirill KUDRYAVTSEV - AFP

Inner workings of AI an enigma - even to its creators

ECONOMY 13.05.2025

Even the greatest human minds building generative artificial intelligence that is poised to change the world admit they do not comprehend how digital minds think.

"People outside the field are often surprised and alarmed to learn that we do not understand how our own AI creations work," Anthropic co-founder Dario Amodei wrote in an essay posted online in April.

"This lack of understanding is essentially unprecedented in the history of technology."

Unlike traditional software programs that follow pre-ordained paths of logic dictated by programmers, generative AI (gen AI) models are trained to find their own way to success once prompted.

In a recent podcast, Chris Olah, who was part of ChatGPT-maker OpenAI before joining Anthropic, described gen AI as "scaffolding" on which circuits grow.

Olah is considered an authority in so-called mechanistic interpretability, a method of reverse engineering AI models to figure out how they work.

This science, born about a decade ago, seeks to determine exactly how AI gets from a query to an answer.

"Grasping the entirety of a large language model is an incredibly ambitious task," said Neel Nanda, a senior research scientist at the Google DeepMind AI lab.

It was "somewhat analogous to trying to fully understand the human brain," Nanda told AFP, noting that neuroscientists have yet to succeed on that front.

Delving into digital minds to understand their inner workings has gone from a little-known field just a few years ago to being a hot area of academic study.

"Students are very much attracted to it because they perceive the impact that it can have," said Boston University computer science professor Mark Crovella.

The area of study is also gaining traction due to its potential to make gen AI even more powerful, and because peering into digital brains can be intellectually exciting, the professor added.

- Keeping AI honest -

Mechanistic interpretability involves not just studying the results served up by gen AI but scrutinizing the calculations performed while the technology mulls queries, according to Crovella.

"You could look into the model...observe the computations that are being performed and try to understand those," the professor explained.

Startup Goodfire uses AI software capable of representing data in the form of reasoning steps to better understand gen AI processing and correct errors.

The tool is also intended to prevent gen AI models from being used maliciously or from deciding on their own to deceive humans about what they are up to.

"It does feel like a race against time to get there before we implement extremely intelligent AI models into the world with no understanding of how they work," said Goodfire chief executive Eric Ho.

In his essay, Amodei said recent progress has made him optimistic that the key to fully deciphering AI will be found within two years.

"I agree that by 2027, we could have interpretability that reliably detects model biases and harmful intentions," said Auburn University associate professor Anh Nguyen.

According to Boston University's Crovella, researchers can already access representations of every digital neuron in AI brains.

"Unlike the human brain, we actually have the equivalent of every neuron instrumented inside these models," the academic said. "Everything that happens inside the model is fully known to us. It's a question of discovering the right way to interrogate that."

Harnessing the inner workings of gen AI minds could clear the way for its adoption in areas where tiny errors can have dramatic consequences, like national security, Amodei said.

For Nanda, better understanding what gen AI is doing could also catapult human discoveries, much as DeepMind's chess-playing AI, AlphaZero, revealed entirely new chess moves that none of the grandmasters had ever thought about.

Properly understood, a gen AI model with a stamp of reliability would grab a competitive advantage in the market.

Such a breakthrough by a US company would also be a win for the nation in its technology rivalry with China.

"Powerful AI will shape humanity's destiny," Amodei wrote.

"We deserve to understand our own creations before they radically transform our economy, our lives, and our future."

(A.Berg--BBZ)
