TypeError: bad operand type for unary ~: 'float'
Question
I have a weird-looking dataframe that I need to wrangle. It looks something like this:
Unnamed: 0 REFERENCE_CODE ... Unnamed: 12 Unnamed: 13
0 Q2 country_satis ... NaN NaN
1 NaN 1 ... NaN NaN
2 NaN 2 ... NaN NaN
3 NaN 8 ... NaN NaN
4 NaN 9 ... NaN NaN
5 NaN NaN ... NaN NaN
6 Q3 econ_sit ... NaN NaN
5 NaN NaN ... NaN NaN
7 NaN 1 ... NaN NaN
8 NaN 2 ... NaN NaN
9 NaN 3 ... NaN
10 NaN 4 ... NaN NaN
11 NaN 8 ... NaN NaN
12 NaN 9 ... NaN NaN
13 NaN NaN ... NaN NaN
14 Q4 children_betteroff2 ... NaN Не четете!
15 NaN 1 ... NaN NaN
16 NaN 2 ... NaN NaN
15 NaN NaN ... NaN NaN
18 NaN 8 ... NaN NaN
19 NaN 9 ... NaN NaN
20 NaN NaN ... NaN NaN
21 Q5 satisfied_democracy ... NaN NaN
22 NaN 1 ... NaN NaN
23 NaN 2 ... NaN NaN
24 NaN 3 ... NaN NaN
(I made some edits to the original here to reflect what may appear in this very long dataframe.) My goal is to produce a unique ID for each of the values (e.g. 1, 2, 8, 9) associated with a question (e.g. country_satis). I am trying to prepend the question name to each value, so that each of my "blocks" looks like this:
0 Q2 country_satis ... NaN NaN
1 NaN country_satis_1 ... NaN NaN
2 NaN country_satis_2 ... NaN NaN
3 NaN country_satis_8 ... NaN NaN
4 NaN country_satis_9 ... NaN NaN
5 NaN NaN ... NaN NaN
Here is my attempt:
df.REFERENCE_CODE = df.REFERENCE_CODE.fillna('')
df.REFERENCE_CODE.str.isnumeric().dtype # returns object
headers = (df.REFERENCE_CODE != '') & ~df.REFERENCE_CODE.str.isnumeric()
res = df.groupby(headers.cumsum())['REFERENCE_CODE'].apply(lambda x: x.iloc[0] + '_' + x)
df.REFERENCE_CODE.update(res[df.REFERENCE_CODE.str.isnumeric()])
I also want to preserve the integrity and structure of the data, because ideally I'd eventually like to do a clean merge of two data sources. I should probably do this in SQL, lol.
Here is the error:
Traceback (most recent call last):
File "/Users/xx/Projects/trend_env/src/script4.py", line 10, in <module>
df.REFERENCE_CODE = df.REFERENCE_CODE.fillna('')
File "/Users/xx/Projects/trend_env/lib/python3.7/site-packages/pandas/core/generic.py", line 5067, in __getattr__
return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'REFERENCE_CODE'
I'm so sorry, I posted the error from the wrong script. Here is the actual error message:
Traceback (most recent call last):
File "/Users/xxx/Projects/trend_env/src/script4.py", line 16, in <module>
headers = (df.REFERENCE_CODE != '') & ~df.REFERENCE_CODE.str.isnumeric()
File "/Users/xxx/Projects/trend_env/lib/python3.7/site-packages/pandas/core/generic.py", line 1466, in __invert__
Index(['Question number', 'REFERENCE_CODE', 'Filter', 'English stem',
'Translator note', 'Philippines - Bicolano', 'Philippines - Cebuano',
'Philippines - Ilonggo', 'Philippines Ilokano', 'Philippines - Tagalog',
'Unnamed: 10', 'Unnamed: 11', 'Unnamed: 12', 'Unnamed: 13'],
dtype='object')
arr = operator.inv(com.values_from_object(self))
TypeError: bad operand type for unary ~: 'float'
Following Andy Hayden's answer, the code now works just fine. Would you mind helping me with one more piece of logic? I have a case where the df looks like this:
25 partyfav_batt NaN
26 partyfav_bulgaria_GERB NaN
27 partyfav_bulgaria_BSP NaN
28 partyfav_bulgaria_DPS NaN
29 NaN
30 partyfav_bulgaria_DPS_1 NaN
31 partyfav_bulgaria_DPS_2 NaN
32 partyfav_bulgaria_DPS_3 NaN
33 partyfav_bulgaria_DPS_4 NaN
34 partyfav_bulgaria_DPS_8 NaN
35 partyfav_bulgaria_DPS_9 NaN
36 NaN
37 partyfav_batt NaN
38 partyfav_canada_Lib NaN
39 partyfav_canada_Cons NaN
40 partyfav_canada_NDP NaN
41 NaN
42 partyfav_canada_NDP_1 NaN
43 partyfav_canada_NDP_2 NaN
44 partyfav_canada_NDP_3 NaN
45 partyfav_canada_NDP_4 NaN
46 partyfav_canada_NDP_8 NaN
47 partyfav_canada_NDP_9 NaN
How can I make it so that when it sees a chunk like this...
37 partyfav_batt NaN
38 partyfav_canada_Lib NaN
39 partyfav_canada_Cons NaN
40 partyfav_canada_NDP NaN
...it turns into something like this (I have condensed it):
39 partyfav_canada_Cons NaN
40 partyfav_canada_NDP NaN
41 NaN
42 partyfav_canada_Cons_1 NaN
43 partyfav_canada_Cons_2 NaN
44 partyfav_canada_Cons_3 NaN
45 partyfav_canada_Cons_4 NaN
42 partyfav_canada_NDP_1 NaN
43 partyfav_canada_NDP_2 NaN
44 partyfav_canada_NDP_3 NaN
45 partyfav_canada_NDP_4 NaN
Recommended answer
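On an object column, .str.isnumeric() returns NaN for missing values, so the result is not a boolean Series and ~ fails on the float NaN. You can fillna first: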
~df.REFERENCE_CODE.fillna('').str.isnumeric()
Example:
In [11]: s = pd.Series(['1', np.nan, 'c'])
In [12]: s
Out[12]:
0 1
1 NaN
2 c
dtype: object
In [13]: s.str.isnumeric()
Out[13]:
0 True
1 NaN
2 False
dtype: object
In [14]: ~s.str.isnumeric()
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-14-2e51f8bd1622> in <module>()
----> 1 ~s.str.isnumeric()
~/.miniconda3/lib/python3.7/site-packages/pandas/core/generic.py in __invert__(self)
1141 def __invert__(self):
1142 try:
-> 1143 arr = operator.inv(com._values_from_object(self))
1144 return self.__array_wrap__(arr)
1145 except Exception:
TypeError: bad operand type for unary ~: 'float'
In [15]: ~s.fillna('').str.isnumeric()
Out[15]:
0 False
1 True
2 True
dtype: bool
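To tie this back to the question, here is a minimal sketch of the same fillna-first idea applied to the header/block logic. The tiny dataframe and the UNIQUE_ID column name are made up for illustration, and groupby(...).transform("first") is swapped in for the apply/update steps from the question, so treat it as one possible approach rather than the original code:

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the REFERENCE_CODE column from the question.
df = pd.DataFrame({
    "REFERENCE_CODE": ["country_satis", "1", "2", "8", "9", np.nan,
                       "econ_sit", "1", "2", np.nan],
})

# Fill NaN first so that .str.isnumeric() yields only True/False, never NaN.
codes = df["REFERENCE_CODE"].fillna("")
is_num = codes.str.isnumeric().astype(bool)

# A header is a non-empty, non-numeric row; cumsum() numbers the blocks.
headers = (codes != "") & ~is_num
block_id = headers.cumsum()

# Question name of each block, repeated on every row of that block.
prefix = codes.groupby(block_id).transform("first")

# Numeric rows become "<question>_<value>"; blank rows go back to NaN.
# UNIQUE_ID is an illustrative column name, not one from the question.
df["UNIQUE_ID"] = codes.where(~is_num, prefix + "_" + codes).mask(codes == "")
```

Writing the result to a separate column leaves REFERENCE_CODE untouched, which may help if the original structure has to survive a later merge.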