This paper presents research projects tackling two aspects of data mining. First, it discusses a toolbox for flexible, interactive data exploration, analysis and presentation using the scripting language Python. The toolbox can process multiple SQL queries in parallel and provides fast data retrieval through a supervised caching mechanism for commonly used queries. Together, these two features give fast, efficient data access and reduce the time spent on data exploration, preparation and analysis. Second, an approach to predictive modelling is presented that leads to scalable parallel algorithms for high-dimensional data collections. Scalability is an essential requirement, because data mining algorithms that do not scale linearly with the data size are infeasible for large data sets. The parallel implementations of these algorithms achieve almost ideal speedup. One aim of the presented research is to integrate these two aspects of data mining into an efficient yet flexible toolbox that allows the experienced data miner to attack large-scale problems either interactively or with batch processing.
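To make the two toolbox features concrete, the following is a minimal Python sketch of running several SQL queries in parallel and caching their results. It is an illustration under stated assumptions, not the toolbox's actual implementation: the class `CachedQueryRunner`, the helper `build_demo_db`, the `records` table and the use of SQLite with a thread pool are all hypothetical, and the plain dictionary cache keyed by query text is only a stand-in for the supervised caching mechanism, whose details are not given here.

```python
import os
import sqlite3
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor


class CachedQueryRunner:
    """Hypothetical helper: runs SQL queries in parallel and caches results.

    The cache is a plain dict keyed by the query text (an assumption made
    for illustration; it is not the supervised cache of the toolbox).
    """

    def __init__(self, db_path, max_workers=4):
        self.db_path = db_path
        self.max_workers = max_workers
        self._cache = {}
        self._lock = threading.Lock()
        self.hits = 0
        self.misses = 0

    def _run_one(self, sql):
        # Serve the result from the cache if this query was seen before.
        with self._lock:
            if sql in self._cache:
                self.hits += 1
                return self._cache[sql]
        # Each call opens its own connection, so no connection object
        # is shared between worker threads.
        conn = sqlite3.connect(self.db_path)
        try:
            rows = conn.execute(sql).fetchall()
        finally:
            conn.close()
        with self._lock:
            self._cache[sql] = rows
            self.misses += 1
        return rows

    def run_many(self, queries):
        # Submit all queries at once; results come back in input order.
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            return list(pool.map(self._run_one, queries))


def build_demo_db(path):
    """Create a tiny example database (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE records (group_id INTEGER, value REAL)")
    conn.executemany(
        "INSERT INTO records VALUES (?, ?)",
        [(1, 10.0), (1, 5.0), (2, 7.5)],
    )
    conn.commit()
    conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")
        build_demo_db(db_path)
        runner = CachedQueryRunner(db_path, max_workers=2)
        queries = [
            "SELECT COUNT(*) FROM records",
            "SELECT SUM(value) FROM records WHERE group_id = 1",
        ]
        print(runner.run_many(queries))  # executed against the database
        print(runner.run_many(queries))  # answered from the cache
        print(runner.hits, runner.misses)
```

In this sketch a repeated query never reaches the database a second time, which is the effect the caching of commonly used queries is meant to achieve. A real deployment would also need a policy for invalidating cached results when the underlying data change, which is left out here.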