QUOTE(InterestedHacker @ Dec 13 2005, 12:04 AM)

Is it possible to approach the problem from another angle, say for example, if we get the public key (no easy task) first
It's certainly not trivial, but it isn't that difficult either. The public key is just that, public. No effort needs to be made to hide it.
QUOTE
then work out how to read the file, extract / check its hash and sig. Then, if we find a really really tiny file, and write some code that could try and produce a signature to match what the original was?
Nope. Message digest algorithms produce digests of the same length for any message (up to the maximum size the algorithm can handle) so a signature for a 5-byte file will be indistinguishable from the signature of a 5-terabyte file.
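To illustrate the fixed-length point, here is a minimal Python sketch using the standard hashlib module. The helper name digest_length and the two sample inputs are made up for illustration, and SHA-1 is used only because it is the algorithm discussed in this thread.

```python
import hashlib

def digest_length(message: bytes) -> int:
    """Return the length in bytes of the SHA-1 digest of `message`."""
    return len(hashlib.sha1(message).digest())

# A 5-byte input and a 5-megabyte input (illustrative sizes only).
small = b"hello"
large = b"\x00" * (5 * 1024 * 1024)

print(digest_length(small))  # digest size for the tiny input
print(digest_length(large))  # digest size for the big input
```

Both calls report the same digest size, so the signature computed over the digest gives no hint about how large the original file was.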
QUOTE
I know it's a brute force hack again, but if the file is small enough, wouldn't there be more chance of this type of attack working?
Nope. You can try signing every possible file up to, say, 20 bytes (which would take a very long time; any higher and it becomes as computationally infeasible as brute-forcing RSA) and the chances of a collision are negligible.
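For a rough sense of scale, this Python sketch counts how many distinct files exist up to a given length and how long it would take to hash them all. The rate of one billion hashes per second is purely an assumption for the arithmetic, not a measured figure, and the function names are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def files_up_to(max_bytes: int) -> int:
    """Count every distinct byte string of length 0..max_bytes."""
    return sum(256 ** k for k in range(max_bytes + 1))

def years_to_try(max_bytes: int, hashes_per_second: float = 1e9) -> float:
    """Years needed to hash every such file at an ASSUMED rate."""
    return files_up_to(max_bytes) / hashes_per_second / SECONDS_PER_YEAR

for n in (4, 8, 20):
    print(n, files_up_to(n), years_to_try(n))
```

Each extra byte multiplies the work by 256, which is why the search stops being feasible after only a handful of bytes.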
QUOTE
What about if we could compare the signatures of two or more files, is there any way we could use the differences between signatures, compared to say CRC32 signature, or some other point of reference?
Nope. Message digest algorithms are designed to be secure against such attacks. Here's an example, from Wikipedia:
SHA1("The quick brown fox jumps over the lazy dog") ==
"2fd4e1c67a2d28fced849ee1bb76e7391b93eb12"
SHA1("The quick brown fox jumps over the lazy cog") ==
"de9f2c7fd25e1b3afad3e85a0bd17d9b100db4b3"
Two different strings. Same length. They differ by only one character ('d' versus 'c', which is three bits). But the digests are completely different. No similarity at all. If you can deduce anything meaningful by comparing those two digests, you're a better man than I.
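The example above can be reproduced with Python's standard hashlib module. This is only a sketch: the helpers sha1_hex and differing_bits are names made up here. It hashes both sentences and counts how many bits differ between the inputs ('d' and 'c' differ in three bits) and between the two 160-bit digests.

```python
import hashlib

def sha1_hex(text: str) -> str:
    """Hex SHA-1 digest of an ASCII string."""
    return hashlib.sha1(text.encode("ascii")).hexdigest()

def differing_bits(a: bytes, b: bytes) -> int:
    """Number of bit positions where two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

dog = "The quick brown fox jumps over the lazy dog"
cog = "The quick brown fox jumps over the lazy cog"

print(sha1_hex(dog))
print(sha1_hex(cog))

# Bits that differ between the two inputs, then between the two digests.
print(differing_bits(dog.encode("ascii"), cog.encode("ascii")))
print(differing_bits(bytes.fromhex(sha1_hex(dog)), bytes.fromhex(sha1_hex(cog))))
```

A tiny change in the input flips a large share of the digest's bits, so comparing digests tells you nothing about how the inputs are related.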
Now, this is not to say that existing message digest algorithms will never be broken. I believe there's already a practical way of finding collisions for MD5, and SHA-1 will probably be there in the next decade. That's why NIST has announced plans to decertify SHA-1 in favor of the multiple flavors of SHA-2 by 2010. Then we'll be back into huge multiples of the age of the universe territory.
Edit: I'm a bit behind the times. Researchers have devised an algorithm that can generate SHA-1 collisions in about 2^69 operations, which is at the very limit of practicability... with a big cluster of heavy iron. Even so, the would-be cracker has a problem. Just because he can generate (after several hundred CPU-years) a hash collision doesn't mean that the collision will be useful for anything. To put it in the context of the Xbox, it doesn't do you any good to find a file that you can affix an existing signature to if the file consists of gibberish that would cause the console to immediately crash anyway.