Efficient traffic engineering of IP networks requires knowledge of the main characteristics of the supported traffic. Several studies have shown that IP network traffic may exhibit burstiness, self-similarity and/or long-range dependence, with significant impact on network performance. In this work, we propose a Markov Modulated Poisson Process (MMPP), and its associated parameter fitting procedure, that is able to incorporate these characteristics over multiple time scales. This is accomplished through a hierarchical construction procedure that, starting from an MMPP that matches the distribution of packet counts at the coarsest time scale, successively decomposes each MMPP state into new MMPPs that incorporate a more detailed description of the distribution at finer time scales. The traffic process is then represented by an MMPP equivalent to the constructed hierarchical structure. The accuracy of the fitting procedure is evaluated by comparing the Hurst parameter, the probability mass function at each time scale, and the queuing behavior (as assessed by the loss probability and average waiting time) of the measured traces with those of synthetic traces generated from the inferred models. Several measured traffic traces exhibiting self-similar behavior are considered: the well-known pOct Bellcore trace, a trace of aggregated IP WAN traffic, and a trace corresponding to the popular file sharing application Kazaa. Our results show that the proposed model and parameter fitting procedure are very effective in matching the main characteristics of the measured traces over the different time scales present in the data.
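
To make the notions of an MMPP and of packet counts at multiple time scales concrete, the following is a minimal sketch, not the fitting procedure proposed in this work. It simulates a continuous-time MMPP and aggregates the resulting packet counts to a coarser time scale. The function names `simulate_mmpp_counts` and `aggregate`, the two-state generator matrix `Q`, and the arrival rates are all illustrative assumptions and are not taken from the measured traces.

```python
import random

def simulate_mmpp_counts(Q, rates, n_bins, bin_width, seed=0):
    """Simulate an MMPP and return the number of arrivals in each time bin.

    Q     : generator matrix of the modulating Markov chain (rows sum to zero)
    rates : Poisson arrival rate associated with each state
    (All parameter values used with this function are illustrative assumptions.)
    """
    rng = random.Random(seed)
    counts = [0] * n_bins
    horizon = n_bins * bin_width
    state, t = 0, 0.0
    while t < horizon:
        exit_rate = -Q[state][state]
        # The sojourn time in a state is exponential with rate -Q[state][state];
        # an absorbing state (exit rate 0) lasts until the end of the horizon.
        sojourn = rng.expovariate(exit_rate) if exit_rate > 0 else horizon - t
        end = min(t + sojourn, horizon)
        # Within a state, arrivals form a Poisson process with rate rates[state],
        # generated here through exponential inter-arrival times.
        if rates[state] > 0:
            arrival = t + rng.expovariate(rates[state])
            while arrival < end:
                counts[min(int(arrival / bin_width), n_bins - 1)] += 1
                arrival += rng.expovariate(rates[state])
        if end >= horizon:
            break
        # The next state is chosen in proportion to the off-diagonal rates.
        targets = [j for j in range(len(rates)) if j != state]
        weights = [Q[state][j] for j in targets]
        state = rng.choices(targets, weights=weights)[0]
        t = end
    return counts

def aggregate(counts, m):
    """Sum consecutive groups of m bins to view the counts at a coarser time scale."""
    return [sum(counts[i:i + m]) for i in range(0, len(counts) - m + 1, m)]

# Hypothetical two-state example: a low-rate state and a bursty high-rate state.
Q = [[-0.1, 0.1], [0.5, -0.5]]
rates = [2.0, 20.0]
counts = simulate_mmpp_counts(Q, rates, n_bins=1000, bin_width=1.0)
coarse = aggregate(counts, 10)  # the same trace seen at a 10x coarser time scale
```

Comparing the empirical distribution of `counts` with that of `coarse` gives a rough feel for how the packet count distribution changes with the time scale, which is the quantity the hierarchical construction is designed to match.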