Looks like a Wes Anderson armory :)
I want to see that parody short film.
Clever
There's no pathway to the back. The entire floor is covered.
Even without the AI errors in convolution, there are several details that point to this picture not being of a working armory.
1. No foot path to get to the stuff in the back.
2. Nothing is stacked. With such limited space, I would have had racks or shelves in the middle to make use of the empty "air" space.
3. Assuming there are "hidden" pegs on the walls that the guns, etc., are resting on, it all seems custom: each item is a different length and fitted next to its neighbors like a puzzle.
In real life, it should just be columns of identically sized rifles.
It looks more like a custom, bespoke gun room than an armory.
There's another:
4. Hamas has a very distinctive tunnel system: they reinforce exclusively with concrete, which limits the shapes and structures they can build. This ain't it.
I think there are some fundamentally inaccurate understandings of how image diffusion works in Cody's segment- his framing is one way to portray things, but his definition and perspective on local convolution isn't really accurate. For example, I generated an image with Midjourney, and the resulting video shows more accurately how diffusion models actually work: <https://cdn.midjourney.com/9546fb4c-328b-4564-b819-f72d55232bca/video.mp4> (It's cat pictures, no jump scares here).
AI doesn't necessarily sequentialize the way that Cody suggests. Diffusion models do work backwards from noise, but they consider the whole image holistically as they do so. The problem of local convolutions is real, and there are local convolutions in the falsely presented image, but it isn't because the AI works on part of the image sequentially and then just "forgets" and starts over as it continues. It's more that the model holds a holistic concept and tries to represent it in a way that's consistent with how it understands the thing being generated. As it cleans away the noise to produce a specific image, it struggles to "fixate" on a single concept of a hand and instead manifests multiple different possibilities for one.
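To make the "holistic" point concrete, here's a toy sketch in Python- purely illustrative, with a placeholder standing in for the trained denoising network- showing that a diffusion sampler nudges every pixel of the whole canvas at every step, rather than finishing one region before moving to the next:

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_noise(x, t):
    # Placeholder for a trained denoiser (e.g. a U-Net); a real model
    # returns its estimate of the noise present in x at timestep t.
    return 0.1 * x

x = rng.standard_normal((64, 64, 3))  # start from pure noise
steps = 50
for t in reversed(range(steps)):
    x = x - predict_noise(x, t)       # the ENTIRE image is refined each step
    if t > 0:                         # DDPM-style samplers re-inject a little
        x = x + 0.01 * rng.standard_normal(x.shape)  # noise between steps

# x is now the generated "image": the whole canvas sharpened together,
# which is why there's no per-region "memory" for the model to lose.
```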
I'm not saying that Cody's take isn't useful- I think it's a very useful baseline understanding for laypeople approaching AI, because it's closer to how we think. But it risks running into some fundamental limitations as AI models improve- for example, if the model had focused on a single rifle in the center of the frame, it probably would have come much closer to a "real"-looking image without local convolutions. Local convolutions are driven by how the model "visualizes" images, but they're also largely a product of factors like an uncertain possibility space and concept latitude. This is also one reason a hand made by AI looks fake even when it does get the right number of fingers- there are many possible arrangements of fingers depending on the gesture being made, polite or not, and the possibility space is very broad.
Notably, one important detail that's useful for identifying AI images is that existing models tend to perform better when they can devote more resolution to a subject- background details are more likely to reveal aberrations than close foreground details. When trying to identify an AI-generated image, deformed faces or features in the background are a telltale sign- and one which is abundantly present in this image. The distorted "scissors" are as much of a tell as the firearms, or more, since scissors are something even someone unfamiliar with firearms and munitions should recognize.
There are many, many ways to recognize AI-generated DIP and misinformation, and Cody only scratches the surface with one of the least reliable ways to tell whether an image is AI-generated, since it relies on a specific failure state that's plausibly fixable even with current models, using something like a LoRA to fine-tune for specific firearm types (sketched below). While a convincingly realistic image of this style is likely beyond existing AI models, a more "focused" concept could easily yield a convincing firearm in a convincing environment, given enough training on similar photographs. As you identified in the video, Hamas uses particular types of AKs with folding stocks. It would potentially be possible to train an embedding on images of Hamas-specific weaponry and basically fine-tune a model to generate appropriate types of rifles of sufficient quality to fool people illiterate on the topic.
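For anyone unfamiliar with the term, here's a generic toy sketch of the LoRA idea (hypothetical shapes, not any particular library's API). The point is that you don't retrain the full weight matrix- you train a tiny low-rank correction on a narrow dataset, which is exactly why adapting a model to one specific subject is cheap:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 512, 512, 8

W = rng.standard_normal((d_out, d_in))        # frozen pretrained weights
A = 0.01 * rng.standard_normal((rank, d_in))  # trainable low-rank factor
B = np.zeros((d_out, rank))                   # trainable; zero at init so the
scale = 1.0                                   # adapter starts as a no-op

def lora_forward(x):
    # Pretrained path plus a cheap learned correction: W x + scale * B(A x).
    return W @ x + scale * (B @ (A @ x))

x = rng.standard_normal(d_in)
y = lora_forward(x)  # identical to W @ x until B is trained away from zero
```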
However, it would be impossible (with current AI technology- obviously this may change as the technology develops) to fix the local coherence issues endemic to fine details in AI-generated images. That means something like scissor handles- which should be readily identifiable to most people, won't vary by region, and can't be waved away as a "frankengun" design or some other fringe justification to muddy the discourse- is actually a viable and possibly even better way to discredit AI-generated DIP and misinformation than focusing on a "smoking gun" that could still fool people without subject matter expertise. I remember a famous post by a journalist who saw an earplug on the ground at a protest and asked if it was a rubber bullet. People who are illiterate on a subject will believe things beyond their scope of understanding because they don't know how to tell the difference, but no one is going to think those scissors look real.
I appreciate your explanation and visualization since I think that helps communicate what's going on a bit more clearly (no slight on the video -- it's always hard to explain these things without actually going into the math but you definitely added useful clarification).
However, I don't think the video claimed that one can rely on looking for local errors as a means of detecting AI images. It was merely presented as one way current AI image generation can fail, with an interesting explanation of how it happens. Indeed, the message I took away from the video was just: look carefully, and here is one thing you can look for.
Indeed, relying on any specific flaw to appear every time something is fake is always going to fail, since the people producing the fake can just use a different method or screen against that test. Even with current-generation AI, you can produce an image at much higher res, downres the result, double-check that it has no obvious giveaways, and rerun if it does -- or just go into Photoshop and manually fix those issues.
Lower right corner, green box. It looks like the handle is at a corner of the lid rather than its center.
Israel gave Hamas three weeks' notice to remove the weapons from the northern part of Gaza.
I would hope that they wouldn't store surgical instruments on a dirt floor.
Only a Tier 1 Idiot would *not* see that as an AI image. AKs with a barrel instead of a stock, anyone? AR butts on AKs? Please. It just screams "fake".
Unfortunately, there are a lot of Tier 1 Idiots in the world.
Someone not knowing what ARs or AKs are, much less being able to recognize the differences between them, doesn't make them an idiot. It just means they don't have intimate knowledge of firearms. Steady up, man!
No, we should call out people who express an opinion when they have no knowledge. Be it "idiot," "moron," "simpleton," "dumbass," or any other derogatory term, use it liberally and often against these smooth-brains when they try to convince others that they know what they're talking about. Staying silent on their horseshit is the same as passively acquiescing to their correctness.
WTF, was that a rollerblading infantry unit?!?
SEALs in the water
Rangers in the mountains
And, now, the Xanadu squad, to counter all those rollerblade arena attacks...
Another thing about this photo- Hamas tunnels don't look like that. Hamas tunnels have a pretty distinctive look: they reinforce the walls and ceiling with cement to prevent collapses, since the ground in Gaza is mostly sand.
Also, why would they make the room round? We saw the video from underneath al-Shifa hospital; they have normal rooms there.
Gentlefolk, we have been 4chan'd
Could you please post in the description whether or not the Substack video contains footage or information that isn't in the YouTube version? That way people can know where to watch it- YT clicks pay out for you, while Substack has already been paid whether I click or not.
If one looks closely at the man with 6 fingers, it's even weirder. There are 5 fingers with the index finger branching off into two, like the letter Y.
I see you found the unlocked achievement: the suicide/homicide rifle, no magazine needed.
I asked GPT-4 to look for inconsistencies in a screenshot of the image, and it said:
Upon inspecting the image, here are some inconsistencies noted along with their approximate pixel locations:
Unusual shadowing on the bottom left weapon, which could indicate an error in light source rendering by the AI — Location: (50, 1030).
Blurry area around the middle part of the table, suggesting a potential issue with the AI's focus or texture blending — Location: (960, 540).
Mismatched texture on the top right corner, possibly due to the AI struggling with maintaining consistency in the pattern or material — Location: (1870, 50).
END OF AI MESSAGE
Stretch the image to a 1080p screen to get matching pixel locations.
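If your copy of the image isn't 1920x1080, a couple of lines of Python will map the quoted coordinates to your resolution instead (assuming, as the message implies, the coordinates were given against a 1920x1080 frame):

```python
def scale_point(x, y, width, height, ref_w=1920, ref_h=1080):
    # Map a coordinate quoted at ref_w x ref_h to an image of a different size.
    return round(x * width / ref_w), round(y * height / ref_h)

print(scale_point(960, 540, 1280, 720))  # -> (640, 360)
```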
Great video, but this isn't a new threat. It's just a return to the situation we've been in throughout most of human history -- you have to rely on trustworthy sources to figure out what's true.
Indeed, I suspect that technology will put us in a better position than we've faced in the past. It's been true for ages that powerful people could manipulate photos, but now that everyone can, I hope to see camera hardware makers (at least for journalists) include cryptographic modules that add digital signatures to the images they take. It will take some work to get right, and in the meantime we're basically back in the era before photography was easy and everywhere, but in the long run technology can make it harder than ever before to fake evidence.
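As a minimal sketch of the signing half of that idea (using the Python cryptography package; in a real camera the private key would live in a tamper-resistant hardware module rather than in software, and the image bytes here are just a placeholder):

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

private_key = Ed25519PrivateKey.generate()  # would be burned in at manufacture
public_key = private_key.public_key()       # published by the manufacturer

image_bytes = b"...raw sensor data..."      # placeholder for the captured photo
signature = private_key.sign(image_bytes)   # shipped with the file as metadata

# Anyone can later check that the bytes haven't changed since capture:
try:
    public_key.verify(signature, image_bytes)
    print("signature valid: image unchanged since signing")
except InvalidSignature:
    print("image was modified after signing")
```

The hard parts, of course, are key management and deciding what still counts as the "same image" once legitimate processing like cropping or compression enters the pipeline- but none of that changes the basic mechanism.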