document updated 13 years ago, on Dec 2, 2011
What are the broad outlines of how data mining, quant finance, etc. are done?
Wikipedia
- [[Analytics]]
- [[Predictive analytics]]
- [[Category:Data mining]]
- [[Category:Analytics]]
- [[Category:Data analysis]]
- [[Category:Financial data analysis]]
- [[Category:Exploratory data analysis]]
- [[Category:Computational statistics]]
- [[Category:Decision theory]]
dataset gathering
- [[Data scraping]]
- [[Text mining]]
- [[News analytics]] ([[sentiment analysis]], [[text analytics]])
- open datasets
- [[Linked data#Datasets]]
- [[Semantic Web#Projects]]
- [[Text corpus#Some notable text corpora]]
- [[Open data#Organisations promoting open data]]
- [[Category:Online databases]] (lots of non-open ones though)
- these should be under one parent category
- [[Category:Biological databases]]
- [[Category:Statistical data sets]]
- [[Category:Datasets in computer vision]]
- http://dir.w3.org/
- [[NASA World Wind#Datasets available]]
- specific places:
- indexes of specific places:
- open data communities
- [[OpenStreetMap]]
- [[Toronto Open Data#Comparative Initiatives]]
Data cleanup / [[Data integration|integration]]
- [[Edge data integration]]
- [[Data virtualization]]
techniques
- [[Category:Computational statistics]]
- [[Bootstrap aggregating]]
- [[Association rule learning]]
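A rough sketch of what [[Bootstrap aggregating]] (bagging) does: resample the data with replacement, fit one model per resample, and average the predictions. The base model here (a least-squares line) and the names `fit_line` and `bagged_predict` are assumptions made for illustration, not anything these notes or the linked articles prescribe.

```python
import random
from statistics import mean

def fit_line(points):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        # Degenerate resample (every x identical): fall back to a flat line.
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in points) / sxx
    return my - b * mx, b

def bagged_predict(points, x_new, n_models=50, seed=0):
    """Average the predictions of n_models lines, each fit on a bootstrap resample."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_models):
        # Bootstrap resample: same size as the data, drawn with replacement.
        sample = [rng.choice(points) for _ in points]
        a, b = fit_line(sample)
        predictions.append(a + b * x_new)
    return mean(predictions)

if __name__ == "__main__":
    data = [(float(x), 2.0 * x + 1.0) for x in range(10)]
    print(bagged_predict(data, 5.0))
```

With a stable base model like a straight line, bagging changes little; it tends to help more with high-variance models such as decision trees.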
quant-specific stuff
- [[Statistical arbitrage]]
- [[Volatility arbitrage]]
- [[Capital asset pricing model]]
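For reference, the [[Capital asset pricing model]] relation is E[R_i] = R_f + beta_i * (E[R_m] - R_f), with beta_i = Cov(R_i, R_m) / Var(R_m). A minimal sketch follows, assuming beta is estimated from paired return samples; the function names and the sample numbers are made up for illustration.

```python
from statistics import mean

def beta(asset_returns, market_returns):
    """Estimate beta as Cov(asset, market) / Var(market) from paired samples."""
    ma, mm = mean(asset_returns), mean(market_returns)
    cov = sum((a - ma) * (m - mm) for a, m in zip(asset_returns, market_returns))
    var = sum((m - mm) ** 2 for m in market_returns)
    # The 1/n (or 1/(n-1)) factors cancel in the ratio, so they are omitted.
    return cov / var

def capm_expected_return(risk_free, beta_i, market_return):
    """CAPM: risk-free rate plus beta times the market risk premium."""
    return risk_free + beta_i * (market_return - risk_free)

if __name__ == "__main__":
    market = [0.01, -0.02, 0.03, 0.00]   # hypothetical market returns
    asset = [2 * m for m in market]      # hypothetical asset moving twice as much
    b = beta(asset, market)
    print(b, capm_expected_return(0.02, b, 0.08))
```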
lower level
- [[Structured data analysis (statistics)|Structured data]] vs [[Unstructured data]] vs [[Semi-structured data]]
- [[Latent variable]]
- [[Dimension reduction]], [[Curse of dimensionality]]
- [[Anscombe's quartet]]
- [[Overfitting]]
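The point of [[Anscombe's quartet]] can be checked directly: datasets that look nothing alike when plotted share nearly identical summary statistics. The sketch below uses only the first two of the four sets (they share the same x values); the helper names `ols_slope` and `summary` are assumptions for illustration.

```python
from statistics import mean

# Anscombe's quartet, sets I and II (same x values for both).
X = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
Y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
Y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

def ols_slope(xs, ys):
    """Slope of the least-squares regression line of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def summary(xs, ys):
    """Mean of y and regression slope, both rounded to two decimals."""
    return round(mean(ys), 2), round(ols_slope(xs, ys), 2)

if __name__ == "__main__":
    print(summary(X, Y1))
    print(summary(X, Y2))
```

Both calls give the same rounded mean and slope even though set I is roughly linear and set II is a smooth curve, which is why plotting before summarizing matters.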
software
- [[Category:Free data analysis software]], [[Category:Free data visualization software]]
- [[Data warehouse]]
- [[Decision support system]]
- [[Unstructured Information Management Architecture]] (UIMA)
- [[Staa]]
- Exhibit, from the SIMILE project