Statistical analysis of basketball should ideally answer at least the following questions:
Who is the best player in the world?
Who is the best player on a team?
What is the overall best lineup for a particular team?
What type of lineup tends to hurt the team most?
Which players should be getting more playing time?
Which position is most in need of an upgrade?
How would a particular player impact this team?
How would an arbitrary team of five players perform?
How would a team perform if given more minutes?
How can a coach maximize his team's chances?
What type of systems and plays would be best for this team?
What kind of players would be best for this system?
Is this player worth his salary?
How does player performance respond to changes in salary?
What contract incentives improve performance?
How will this talent translate onto this team or into this league?
Will the player be able to learn more and grow?
How long can this athlete be expected to remain healthy?
How quickly can this injured athlete be expected to recover?
What is the distribution of risk for this player?
Note that this is just a sample of questions that statistical analysis should be able to answer. There are three points to take away from this:
1) The variety of questions is enormous. It's not just about who the best player is or what the best fantasy team would be. Those are only parts of the overall goal of building a champion.
2) The tools required to answer each question differ enormously. Sometimes you need to know salary, and for that you need to know how to measure it accurately. It's not just dollars per year and it's not just the number of years. There's more to it.
3) The amount of data required is enormous. You need, at the very least, all of the box scores, play-by-plays, athlete measurements, historical salary information, medical histories, and front office transactional information. By my estimate, 10 years of such data would take up more than 30 gigabytes.
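
As one illustration of point 2, here is a minimal sketch of why dollars per year and number of years do not fully describe a contract. Everything in it is an assumption made for illustration: the `Contract` class, the two example contracts, and the 5% discount rate are hypothetical, and discounting future payments is only one of several possible ways to measure salary more carefully.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Contract:
    # Hypothetical structure: salary paid in each season, in order, in dollars.
    yearly_salaries: List[float]


def total_value(contract: Contract) -> float:
    """Sum of all payments, ignoring when they are made."""
    return sum(contract.yearly_salaries)


def average_annual_value(contract: Contract) -> float:
    """Total value divided by the number of seasons."""
    return total_value(contract) / len(contract.yearly_salaries)


def present_value(contract: Contract, discount_rate: float) -> float:
    """Discount each season's payment back to the first season.

    The discount rate is an assumed parameter, not a known league figure.
    """
    return sum(
        salary / (1.0 + discount_rate) ** season
        for season, salary in enumerate(contract.yearly_salaries)
    )


# Two made-up four-year contracts with the same length and the same
# dollars per year on average, but different payment schedules.
flat = Contract([10e6, 10e6, 10e6, 10e6])
backloaded = Contract([4e6, 8e6, 12e6, 16e6])

if __name__ == "__main__":
    for name, c in [("flat", flat), ("backloaded", backloaded)]:
        print(
            name,
            round(total_value(c)),
            round(average_annual_value(c)),
            round(present_value(c, 0.05)),
        )
```

Both contracts report the same total and the same average annual value, yet under any positive discount rate the backloaded one is worth less today. A fuller measure would also need to account for things this sketch ignores, such as guarantees, options, and incentives.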