I suspect you would be better off asking on a music, comms or engineering forum. To me it sounds hard to get right, though perhaps not difficult to get something working as a prototype. A quick search turns up page upon page of people saying "it's hard":
audio - Note onset detection - Stack Overflow
java - Graphing the pitch (frequency) of a sound - Stack Overflow
As far as I know, the FFT is what everybody uses for this. There could be a magical other option, but if you're on new programming ground I would go with 'the way everybody does it': there'll be more examples and more help available. Good luck!
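For what it's worth, here is a rough prototype of the FFT approach, assuming you already have a mono block of samples in a list. It is only a sketch: the function names (`fft`, `dominant_frequency`), the Hann window, and the 8000 Hz / 2048-sample test tone are my own choices for illustration, not anything from your code. It picks the strongest frequency bin, which for real instruments is not always the perceived pitch (a harmonic can be louder than the fundamental), and that is a large part of why people say this is hard.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 FFT. len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequency(samples, sample_rate):
    """Return the centre frequency (Hz) of the strongest bin, ignoring DC."""
    n = len(samples)
    # A Hann window reduces leakage from tones that fall between bins.
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(samples)
    ]
    spectrum = fft(windowed)
    # Only bins 1 .. n/2 - 1 are needed for real input (skip DC and mirror half).
    magnitudes = [abs(c) for c in spectrum[1 : n // 2]]
    peak_bin = 1 + magnitudes.index(max(magnitudes))
    return peak_bin * sample_rate / n


# Synthetic test signal (assumed values): a pure 440 Hz sine.
sample_rate = 8000
n = 2048
tone = [math.sin(2 * math.pi * 440 * i / sample_rate) for i in range(n)]
estimate = dominant_frequency(tone, sample_rate)
print(f"estimated pitch: {estimate:.1f} Hz")
```

The estimate can only land on a bin centre, so it is accurate to about `sample_rate / n` Hz (roughly 3.9 Hz here). A longer block gives finer frequency resolution but coarser timing, which matters if you also want note onsets. In practice you would use a library FFT rather than this pure-Python one, which is far too slow for real-time audio.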