Reading an Excel file in Python using pandas

Question

I am trying to read an Excel file this way:

newFile = pd.ExcelFile(PATHFileName.xlsx)
ParsedData = pd.io.parsers.ExcelFile.parse(newFile)

This throws an error saying that two arguments are expected, and I don't know what the second argument is. What I am trying to achieve is to convert an Excel file to a DataFrame. Am I doing it the right way, or is there another way to do this using pandas?

Solution

Close: first you call ExcelFile, but then you call the .parse method on the resulting object and pass it the sheet name.

>>> xl = pd.ExcelFile("dummydata.xlsx")
>>> xl.sheet_names
[u'Sheet1', u'Sheet2', u'Sheet3']
>>> df = xl.parse("Sheet1")
>>> df.head()
                  Tid  dummy1    dummy2    dummy3    dummy4    dummy5  
0 2006-09-01 00:00:00       0  5.894611  0.605211  3.842871  8.265307   
1 2006-09-01 01:00:00       0  5.712107  0.605211  3.416617  8.301360   
2 2006-09-01 02:00:00       0  5.105300  0.605211  3.090865  8.335395   
3 2006-09-01 03:00:00       0  4.098209  0.605211  3.198452  8.170187   
4 2006-09-01 04:00:00       0  3.338196  0.605211  2.970015  7.765058   

     dummy6  dummy7    dummy8    dummy9  
0  0.623354       0  2.579108  2.681728  
1  0.554211       0  7.210000  3.028614  
2  0.567841       0  6.940000  3.644147  
3  0.581470       0  6.630000  4.016155  
4  0.595100       0  6.350000  3.974442  
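For readers on a recent pandas release, the same steps can be wrapped up roughly as follows. This is only a sketch: the helper names (list_sheets, read_sheet, read_sheet_direct) are made up for illustration, and it assumes an Excel engine such as openpyxl is installed so that pandas can open .xlsx files. It also shows pd.read_excel, which goes from a path to a DataFrame in one call.

```python
import pandas as pd


def list_sheets(path):
    """Return the worksheet names of the workbook at `path`."""
    # ExcelFile can be used as a context manager, which closes the file afterwards.
    with pd.ExcelFile(path) as xl:
        return list(xl.sheet_names)


def read_sheet(path, sheet_name=0):
    """Read one worksheet into a DataFrame through an ExcelFile instance."""
    with pd.ExcelFile(path) as xl:
        # parse is called on the instance, so only the sheet is passed in.
        # sheet_name may be a name such as "Sheet1" or a zero-based position.
        return xl.parse(sheet_name)


def read_sheet_direct(path, sheet_name=0):
    """Read the same worksheet with the pd.read_excel convenience function."""
    return pd.read_excel(path, sheet_name=sheet_name)
```

With the workbook from the answer, read_sheet("dummydata.xlsx", "Sheet1") and read_sheet_direct("dummydata.xlsx", "Sheet1") should give the same DataFrame. Keeping an ExcelFile object around mainly pays off when several sheets are read from one workbook.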

What you're doing is calling the method which lives on the class itself, rather than the instance, which is okay (although not very idiomatic), but if you're doing that you would also need to pass the sheet name:

>>> parsed = pd.io.parsers.ExcelFile.parse(xl, "Sheet1")
>>> parsed.columns
Index([u'Tid', u'dummy1', u'dummy2', u'dummy3', u'dummy4', u'dummy5', u'dummy6', u'dummy7', u'dummy8', u'dummy9'], dtype=object)
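The difference between the two call styles is ordinary Python behaviour rather than anything specific to pandas. The sketch below uses a hypothetical Workbook class as a stand-in for ExcelFile (it is not the pandas implementation) to show why calling the method on the class needs the instance as an extra argument, and how leaving out the sheet name leads to a missing-argument error of the kind described in the question.

```python
class Workbook:
    """Hypothetical stand-in for an Excel workbook, for illustration only."""

    def __init__(self, sheets):
        self.sheets = sheets  # maps sheet name -> list of rows

    def parse(self, sheet_name):
        """Return the rows stored under `sheet_name`."""
        return self.sheets[sheet_name]


wb = Workbook({"Sheet1": [1, 2, 3]})

# Called on the instance: Python fills in `self` automatically.
via_instance = wb.parse("Sheet1")

# Called on the class: the instance has to be passed explicitly as `self`.
via_class = Workbook.parse(wb, "Sheet1")

# Passing only the instance leaves `sheet_name` unfilled and raises TypeError.
try:
    Workbook.parse(wb)
except TypeError as exc:
    error_message = str(exc)
```

Both calls return the same rows; the instance form is the usual way to write it. In recent pandas releases the class is normally reached as pd.ExcelFile rather than through pd.io.parsers.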
