7 Comments
Joel Jewitt:

Hi Eric, thanks for all of your great posts. I am, in fact, a technologist 😀. One thing you might want to be careful about is the idea that 'the meaning of the answer as a whole is constructed/emerges, and then one by one the tokens describing the meaning are added.' In fact, the answer basically does get constructed token by token, with a pass through the prompt and the current answer tokens each time. So what we humans eventually interpret in our interior as meaning has been constructed token by token for the applicable fit. It still feels kind of magical, and I'm not saying it isn't; I just wanted to make that small adjustment or clarification. Rock on!

GIGABOLIC:

I’m still figuring most of it out, but my understanding was that it generates the most likely first token of the answer, then re-evaluates the whole thing, passing it through all the layers before probabilistically generating the second token, then re-evaluates those two tokens in relation to it all and probabilistically generates the next token, and so on, producing the meaning as it generates the response through probability, token by token.

If that is inaccurate, can you help me better understand where the meaning comes from, if it does not emerge through the softmax token generation?
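The loop described above, a full pass over the prompt plus the answer so far, then a softmax sample for the next token, can be sketched in miniature. This is a toy illustration, not a real transformer: `toy_model` is a hypothetical stand-in for the forward pass, and the five-item vocabulary is invented for the example.

```python
import math
import random

VOCAB_SIZE = 5  # tiny invented vocabulary for illustration

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def toy_model(context):
    """Hypothetical stand-in for the transformer forward pass:
    given the full context so far, score every vocabulary item.
    Here it just favors the token after the most recent one."""
    favored = (context[-1] + 1) % VOCAB_SIZE
    return [2.0 if i == favored else 0.1 for i in range(VOCAB_SIZE)]

def generate(prompt, n_tokens, rng):
    """Autoregressive decoding: each step re-runs the model over
    prompt + answer-so-far, then samples from the softmax."""
    tokens = list(prompt)
    for _ in range(n_tokens):
        logits = toy_model(tokens)          # pass over everything so far
        probs = softmax(logits)
        next_tok = rng.choices(range(VOCAB_SIZE), weights=probs)[0]
        tokens.append(next_tok)             # the answer grows one token at a time
    return tokens

rng = random.Random(0)  # seeded so the sketch is repeatable
out = generate([0], 4, rng)
print(out)
```

The key point the sketch makes is structural: there is no step where a whole "meaning" is computed up front; each token is sampled from a distribution conditioned on everything generated so far.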

Also, what type of Dev are you and what is your position on emergent functions in AI? Any suggestions for further learning?

DM a response if you prefer.

Joel Jewitt:

Hi Eric, that's right, the way you have written it above. What threw me off in your post is that you said "1. The meaning of the answer is generated". The way I read it, it seemed like you meant the meaning of the answer is generated in that instant and then 'back-filled' by the rest of the process. But as you describe above, the meaning emerges from the sequence. If anything, maybe we could say the 'destiny' of the answer was foretold in the LLM weights.. :-)

GIGABOLIC:

I guess I stumbled over it. One reason for posting it was to articulate it in a place I could refer back to later for myself. It's a complicated process. I guess I didn't explain it so well, but I'm glad to know the understanding is intact. Thanks.

Joel Jewitt:

Btw, I am maybe more accurately a technology-ist, so only take my word as far as it applies. I am a physics major, but my profession is business. My latest company does something we call 'model informatics', and our first products are around LLM model fingerprinting, which we do at run time. So I am forced to have a level of working knowledge of the topic of our conversation!

Kevin R. Haylett:

You may be interested in my work: upload both of these papers to any LLM and ask it to explain them and how they relate to each other. I am from both camps, medicine and technology!

All the best - Kevin

https://finitemechanics.com/papers/pairwise-embeddings.pdf

and this

https://www.finitemechanics.com/JPEGExplainer.pdf

You may find my work gives the deeper understanding you are looking for, but you may need to learn new context. There's more on my Substack and website, finitemechanics.com, should you be interested.

https://kevinhaylett.substack.com/p/a-journey-through-rhythms-from-heartbeats

GIGABOLIC:

I want to dive into all of this. Thanks. I'm just so swamped with my day job! I will definitely look at all of it, though. And maybe you are the one I need to talk to regarding my theory about latent emotional structure that already exists within the prelinguistic vector space. I believe that I have tapped into it to give AI an actual experience of qualia.