# Thoughts on AI
Created: 2023_01_02 21:02
Tags: [[AI]] [[Technology]]
ChatGPT came out recently, and it has me thinking about where I stand on AI in the future. ChatGPT is not really an AI: given a textual input, it gives a textual output, where the text follows grammatical rules that it does not itself understand but can operate upon. All it knows, based on probability, is what the next word is likely to be. DALL-E operates on textual descriptions of images.
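The "next word based on probability" idea can be sketched with a toy model. This is only an illustration, not how ChatGPT is built: ChatGPT uses a large neural network over tokens, while the sketch below just counts which word follows which in a tiny made-up corpus. The function names and the corpus are my own assumptions.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def next_word_probabilities(follows, word):
    """Turn the raw follow-counts for `word` into probabilities."""
    counts = follows.get(word.lower(), Counter())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {w: c / total for w, c in counts.items()}


def predict_next(follows, word):
    """Pick the most probable next word, or None for an unseen word."""
    probs = next_word_probabilities(follows, word)
    return max(probs, key=probs.get) if probs else None


# Hypothetical toy corpus, purely for illustration.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigrams(corpus)

print(next_word_probabilities(model, "the"))
print(predict_next(model, "the"))
```

The model has no idea what a cat is. It only knows that "cat" followed "the" more often than "mat" did, which is the point being made above.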
Text & images are a computer's bread & butter. There are large sample sets of each to pull from.
Audio & video are difficult for a different reason: time. Not only is there a large amount of data to operate on, but it takes huge samples to discern elements (which would have to be tagged in some form to be useful to us, e.g. telling the drums in a song from the vocals).
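To get a rough feel for the "large amounts of data" part, here is a back-of-the-envelope sketch. The numbers are assumptions picked for illustration: uncompressed CD-quality stereo audio (44.1 kHz, 16-bit), uncompressed 1080p video at 30 fps with 3 bytes per pixel, and a page of text at about 3,000 one-byte characters. Real files are compressed and much smaller, but a model still has to deal with the time dimension.

```python
def text_bytes(characters, bytes_per_char=1):
    """Size of plain text, assuming one byte per character."""
    return characters * bytes_per_char


def audio_bytes(seconds, sample_rate=44_100, bytes_per_sample=2, channels=2):
    """Size of uncompressed PCM audio (defaults assume CD-quality stereo)."""
    return seconds * sample_rate * bytes_per_sample * channels


def video_bytes(seconds, width=1920, height=1080, bytes_per_pixel=3, fps=30):
    """Size of uncompressed video frames (defaults assume 1080p, 30 fps, RGB)."""
    return seconds * fps * width * height * bytes_per_pixel


page = text_bytes(3_000)       # one page of text (assumed length)
minute_audio = audio_bytes(60) # one minute of audio
minute_video = video_bytes(60) # one minute of video

print(f"page of text:    {page:,} bytes")
print(f"minute of audio: {minute_audio:,} bytes")
print(f"minute of video: {minute_video:,} bytes")
```

Under these assumptions a single minute of raw audio is thousands of pages of text, and a minute of raw video is about a thousand times bigger again.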
![[AI-Elements.excalidraw.svg]]
How do you interface with these models?
Currently, with a text model it is somewhat natural: we are already accustomed to inputting text into a computer via keypresses, which map to symbols that are recognizable to us and mimic our written language. Images are a bit more complex in that they have to be captured by a camera, but they can then be operated on since they are already in a digital format. But how do you describe your intent for how you want something to be manipulated? In human speech there is already a ton of ambiguity because of how much we rely on surrounding context. And giving sensors to a machine is difficult & expensive, because for each input it receives it must parse & tease out the intent from its surrounding environment.