CsvReader and Linq
In the previous post, My new Innovative project, I introduced a personal finance project that I plan to develop.
The first problem I want to solve is reading CSV files, since all the data is stored in that format.
I developed a simple generic CsvReader that reads this data before I display it in a graph.
The CSV file looks something like this (I downloaded S&P data for the last 5 years from Yahoo):
The CSV file can be downloaded from here.
CsvReader Usage:
First define an interface IMarketEntity that the generic reader will fill automatically:
public interface IMarketEntity
{
    DateTime Date { get; set; }
    Double Open { get; set; }
    Double High { get; set; }
    Double Low { get; set; }
    Double Close { get; set; }
    Double Volume { get; set; }
    Double AdjClose { get; set; }
}
The usage of the CsvReader is very simple:
String fileName = "S_and_P_2000_2008_Daily.csv";
CsvReader<MarketEntity> reader = new CsvReader<MarketEntity>(fileName);
ICollection<MarketEntity> data = reader.Parse();
The MarketEntity class is the implementation of the IMarketEntity interface.
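The MarketEntity class itself is not listed in this post. A minimal sketch, assuming it does nothing more than implement the interface with auto-properties, could look like this (the interface is repeated only so that the snippet compiles on its own):

```csharp
using System;

public interface IMarketEntity
{
    DateTime Date { get; set; }
    Double Open { get; set; }
    Double High { get; set; }
    Double Low { get; set; }
    Double Close { get; set; }
    Double Volume { get; set; }
    Double AdjClose { get; set; }
}

// Assumed implementation: plain auto-properties, no extra logic.
// The implicit public parameterless constructor satisfies the
// "where T : new()" constraint of CsvReader<T>.
public class MarketEntity : IMarketEntity
{
    public DateTime Date { get; set; }
    public Double Open { get; set; }
    public Double High { get; set; }
    public Double Low { get; set; }
    public Double Close { get; set; }
    public Double Volume { get; set; }
    public Double AdjClose { get; set; }
}
```

The property names matter: the reader matches each CSV header to a property of the same name, with spaces removed.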
The CsvReader source code:
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Reflection;

public class CsvReader<T> where T : new()
{
    private String m_path;

    public CsvReader(String path)
    {
        m_path = path;
    }

    // Reads the file and returns one T per data row, or null if the file does not exist.
    public ICollection<T> Parse()
    {
        if (File.Exists(m_path))
        {
            using (StreamReader reader = new StreamReader(m_path))
            {
                String str = reader.ReadToEnd();
                Int32 idx = str.IndexOf("\n");
                // The first line is the header. TrimEnd copes with both \r\n and \n line endings.
                String header = str.Substring(0, idx).TrimEnd('\r');
                String[] headers = header.Split(new Char[] { ',' });
                List<PropertyInfo> properties = new List<PropertyInfo>();
                foreach (String h in headers)
                {
                    // "Adj Close" maps to the AdjClose property; null if T has no such property.
                    PropertyInfo propertyInfo = typeof(T).GetProperty(h.Replace(" ", ""));
                    properties.Add(propertyInfo);
                }
                String data = str.Substring(idx + 1);
                return Parse(properties, data);
            }
        }
        return null;
    }

    private ICollection<T> Parse(List<PropertyInfo> properties, String str)
    {
        ICollection<T> result = new List<T>();
        String[] data = str.Split(new Char[] { '\n' });
        foreach (String line in data)
        {
            String row = line.TrimEnd('\r');
            if (String.IsNullOrEmpty(row))
            {
                continue; // skip blank lines, such as the one after the last row
            }
            T item = new T();
            String[] rowData = row.Split(new Char[] { ',' });
            Int32 ii = 0;
            foreach (PropertyInfo propertyInfo in properties)
            {
                Object obj = rowData[ii++];
                if (propertyInfo == null)
                {
                    continue; // this column has no matching property on T
                }
                // Parse with the invariant culture so "1288.14" works on any machine locale.
                switch (propertyInfo.PropertyType.ToString())
                {
                    case "System.DateTime":
                        obj = DateTime.Parse((String)obj, CultureInfo.InvariantCulture);
                        break;
                    case "System.Double":
                        obj = Double.Parse((String)obj, CultureInfo.InvariantCulture);
                        break;
                }
                propertyInfo.SetValue(item, obj, null);
            }
            result.Add(item);
        }
        return result;
    }
}
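The core trick in the reader is mapping a header name to a property through reflection. The following is a small sketch of just that step. The Quote class and the SetByHeader helper are hypothetical names used for illustration only and are not part of the CsvReader:

```csharp
using System;
using System.Globalization;
using System.Reflection;

// Hypothetical type, used only for this illustration.
public class Quote
{
    public DateTime Date { get; set; }
    public Double AdjClose { get; set; }
}

public static class HeaderMapping
{
    // Sets the property whose name equals the header with spaces removed.
    // Returns false when the target has no such property.
    // Only DateTime, Double and String properties are handled in this sketch.
    public static Boolean SetByHeader(Object target, String header, String text)
    {
        PropertyInfo property = target.GetType().GetProperty(header.Replace(" ", ""));
        if (property == null)
        {
            return false;
        }

        Object value = text;
        if (property.PropertyType == typeof(DateTime))
        {
            value = DateTime.Parse(text, CultureInfo.InvariantCulture);
        }
        else if (property.PropertyType == typeof(Double))
        {
            value = Double.Parse(text, CultureInfo.InvariantCulture);
        }

        property.SetValue(target, value, null);
        return true;
    }
}
```

For example, HeaderMapping.SetByHeader(quote, "Adj Close", "1288.14") would set quote.AdjClose, while a header with no matching property simply returns false. The value 1288.14 is made up for the example.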
And now for the Linq part:
Once the CsvReader returns a collection of MarketEntity, I can use Linq to query it. Here are some samples:
1) Get all the days with values between 1375 and 1416
var res1 =
    from e in data
    where (e.Low > 1375 && e.High < 1416)
    select new { Low = e.Low, High = e.High };
2) Get the days with volume > 5700000000
var res2 =
    from e in data
    where e.Volume > 5700000000
    select e;
3) Get the days with volume > 5700000000 - The same query as 2, written with extension methods
IEnumerable<MarketEntity> res3 = data.Select(e => e).Where(e => e.Volume > 5700000000);
4) Get the days with volume > 5700000000 - Return only the volume
var res4 = data.Select(e => new { Volume = e.Volume }).Where(e => e.Volume > 5700000000);
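Sample 4 projects every row and then filters the projected values. The same result can be obtained by filtering first, so that only the matching rows are projected. Here is a sketch of that ordering, using a hypothetical Bar class as a stand-in for MarketEntity and a hypothetical helper name:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for MarketEntity with only the field used here.
public class Bar
{
    public Double Volume { get; set; }
}

public static class VolumeQueries
{
    // Filter first, then project: only rows above the threshold are projected.
    public static List<Double> HighVolumes(IEnumerable<Bar> data, Double threshold)
    {
        return data.Where(e => e.Volume > threshold)
                   .Select(e => e.Volume)
                   .ToList();
    }
}
```

Calling VolumeQueries.HighVolumes(data, 5700000000) would return just the volumes above that threshold. On a few thousand rows the difference is negligible, so this is mostly a matter of taste.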
5) Group by days high value / 100
var res5 =
    from e in data
    where e.High > 1500
    group e.High by e.High / 100 into g
    select new { High = g.Key, Numbers = g };
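Because High is a Double, the key e.High / 100 in sample 5 is a fractional value, so two days land in the same group only when their highs are exactly equal. If the intent is buckets of 100 points, one possible variation is to round the key down with Math.Floor. The sketch below works on a plain sequence of highs, and the helper name is hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HighBuckets
{
    // Counts how many highs fall into each 100-point bucket.
    // A key of 15 stands for the range [1500, 1600).
    public static Dictionary<Double, Int32> CountByHundreds(IEnumerable<Double> highs)
    {
        return highs.GroupBy(h => Math.Floor(h / 100))
                    .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

It could be called as HighBuckets.CountByHundreds(data.Select(e => e.High)) to see how the daily highs are distributed.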
The source code can be found here.