Standing waves in 2D structures have been investigated for centuries. The circular membrane has an analytic solution in terms of Bessel functions. Whatever the shape, the result is a set of (inharmonic) eigenfrequencies, which you can auralize as bell- or gong-like tones.
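To make the circular case concrete: for an ideal membrane clamped at its rim, the eigenfrequencies are proportional to the zeros of the Bessel functions J_m, so the inharmonic ratios can be computed directly. Here is a minimal standard-library Python sketch under that assumption. The helper names (bessel_j, bessel_zeros, membrane_ratios) are made up for illustration, and the plain power series is only adequate for the small arguments needed for the first few modes.

```python
import math

def bessel_j(m, x, terms=40):
    """J_m(x) from its power series; fine for the small x used here."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k / (math.factorial(k) * math.factorial(k + m))
                  * (x / 2) ** (2 * k + m))
    return total

def bessel_zeros(m, count, step=0.05):
    """First `count` positive zeros of J_m: scan for sign changes, then bisect."""
    zeros = []
    a, fa = step, bessel_j(m, step)
    while len(zeros) < count:
        b = a + step
        fb = bessel_j(m, b)
        if fa * fb < 0:
            lo, hi, flo = a, b, fa
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                fmid = bessel_j(m, mid)
                if flo * fmid <= 0:
                    hi = mid
                else:
                    lo, flo = mid, fmid
            zeros.append(0.5 * (lo + hi))
        a, fa = b, fb
    return zeros

def membrane_ratios(max_m=3, zeros_per_m=2, keep=6):
    """Lowest `keep` mode frequencies of a clamped circular membrane,
    relative to the fundamental (only orders m <= max_m are searched)."""
    roots = sorted(z for m in range(max_m + 1) for z in bessel_zeros(m, zeros_per_m))
    return [r / roots[0] for r in roots[:keep]]

if __name__ == "__main__":
    print([round(r, 3) for r in membrane_ratios()])
```

The ratios come out at roughly 1, 1.59, 2.14, 2.30, 2.65 and 2.92, none of them integer multiples of the fundamental, which is why such a spectrum sounds like a drum or gong rather than a string.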

General shapes require a numerical solution, and since this is very important for structural engineering, libraries and full programs to compute that kind of stuff (usually based on the finite element method) are widely available.
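As a rough illustration of what "numerical solution" means here, the sketch below estimates the lowest eigenvalue of the Laplacian on an L-shaped membrane with fixed edges. It is a toy under stated assumptions: it uses a simple finite-difference grid rather than finite elements, plain power iteration rather than a real eigensolver, and the function name lowest_eigenvalue_l_shape is made up. The domain is the square [-1, 1] x [-1, 1] with one quadrant removed; the mode frequency is proportional to the square root of the eigenvalue.

```python
import math

def lowest_eigenvalue_l_shape(n=8, iterations=1500):
    """Smallest eigenvalue of the 5-point discrete Laplacian on an L-shape.

    Grid spacing is h = 1/n. The L-shape is [-1, 1]^2 with the quadrant
    x >= 0, y >= 0 removed; the membrane is held at zero on the boundary.
    """
    h = 1.0 / n
    # Interior grid points (i, j) with x = i*h, y = j*h.
    pts = [(i, j) for i in range(-n + 1, n) for j in range(-n + 1, n)
           if not (i >= 0 and j >= 0)]
    index = {p: k for k, p in enumerate(pts)}
    # Neighbours that are interior points; missing ones are boundary zeros.
    nbrs = [[index[q] for q in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1))
             if q in index] for i, j in pts]

    def apply_laplacian(u):
        return [(4.0 * u[k] - sum(u[q] for q in nbrs[k])) / (h * h)
                for k in range(len(u))]

    # Power iteration on (shift*I - A): its dominant eigenvector is the
    # eigenvector of A with the smallest eigenvalue, since shift >= max eig(A).
    shift = 8.0 / (h * h)
    u = [1.0] * len(pts)
    for _ in range(iterations):
        au = apply_laplacian(u)
        u = [shift * uk - ak for uk, ak in zip(u, au)]
        norm = math.sqrt(sum(v * v for v in u))
        u = [v / norm for v in u]

    # Rayleigh quotient gives the eigenvalue estimate.
    au = apply_laplacian(u)
    return sum(a * b for a, b in zip(u, au)) / sum(v * v for v in u)

if __name__ == "__main__":
    print(lowest_eigenvalue_l_shape())
```

The accepted value for this domain is about 9.64. A coarse grid like this one only gets into the neighbourhood, and refining it improves the estimate slowly, which is one reason real codes use more careful methods.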

It's possible to transform a human into an image using a camera, but it's not generally possible to convert any image into a human. You can still have fun with face recognition/synthesis software by taking it outside its domain of application, so your idea could work, but I think right now it's more an idea for an idea.

I would try to interpret the fractal image as some material structure (as per Claude's suggestion), then use that material object as a sound generator via some simulation code. I don't see how that would make interesting sounds, but there may be a way if you're more creative than me.

Have you ever seen the MATLAB logo and wondered what it is? Way back in my student days I was doing stuff like that and had computed the modes of various objects and inserted them in an interactive game-like graphics program where you could touch the objects to make their sounds. It featured an L-shaped coffee table.

A bit later little me got a visit from the big Cleve Moler, the creator of MATLAB. Why? Turns out the L-shape is particularly difficult to compute accurately due to numerical issues of the wave equation at the 270-degree corner, and this guy worked on that when he was young and never lost his obsession with that shape. To the extent that he put an L-membrane eigenmode on the MATLAB logo, where it still is to this day.

More here:

https://www.mathworks.com/content/dam/mathworks/tag-team/Objects/t/72943_92021v00Cleve_L_Shaped_Membrane_Nov_2003.pdf

PS. You may wonder whether each 2D membrane shape has a unique frequency spectrum (the "Can one hear the shape of a drum?" question). This turned out not to be the case, but counterexamples were not found until the early 1990s (Gordon, Webb, and Wolpert, 1992).

PPS. From each spectrum you can make an infinite number of sounds by summing its modes with different amplitudes. Some of these amplitude choices correspond to physical interactions like hitting the membrane/plate at some specific location.
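A small sketch of that idea, in standard-library Python. To keep it self-contained it assumes a square membrane with fixed edges (where the mode shapes are simple products of sines) rather than a general shape, and it weights each mode by the value of its mode shape at the strike point. The function names, the 110 Hz fundamental, and the single shared decay rate are all illustrative assumptions, not a physical model of any real instrument.

```python
import math

def square_membrane_modes(max_index=4, fundamental_hz=110.0):
    """(m, n, frequency) for a unit square membrane with fixed edges.

    Frequencies scale as sqrt(m^2 + n^2); the (1, 1) mode is tuned to
    fundamental_hz (an arbitrary choice).
    """
    return [(m, n, fundamental_hz * math.sqrt(m * m + n * n) / math.sqrt(2))
            for m in range(1, max_index + 1)
            for n in range(1, max_index + 1)]

def strike_amplitudes(modes, x, y):
    """Mode amplitudes for a hit at (x, y): the mode shape evaluated there."""
    return [math.sin(m * math.pi * x) * math.sin(n * math.pi * y)
            for m, n, _ in modes]

def synthesize(freqs, amps, duration=1.0, sample_rate=8000, decay=3.0):
    """Sum of exponentially decaying sinusoids, normalized to peak 1."""
    out = []
    for i in range(int(duration * sample_rate)):
        t = i / sample_rate
        env = math.exp(-decay * t)  # same decay for every mode (assumption)
        out.append(env * sum(a * math.sin(2 * math.pi * f * t)
                             for f, a in zip(freqs, amps)))
    peak = max(abs(s) for s in out) or 1.0
    return [s / peak for s in out]

if __name__ == "__main__":
    modes = square_membrane_modes()
    freqs = [f for _, _, f in modes]
    centre_hit = synthesize(freqs, strike_amplitudes(modes, 0.5, 0.5))
    corner_hit = synthesize(freqs, strike_amplitudes(modes, 0.15, 0.2))
    print(len(centre_hit), len(corner_hit))
```

Hitting the exact centre silences every mode with an even index, because those modes have a nodal line through the centre, so the same spectrum gives a noticeably different tone than an off-centre hit.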