
[Pandas] Transforming Data, GroupBy, and Merging DataFrames

by 유스 :) 2023. 7. 11.

1. Transforming Data

 

1) Cleaning String Data with Regular Expressions

import re

# Regex: inside [], a leading ^ means "not". Every character other than 0-9, a-z, A-Z, ':' and ',' is replaced with a space.
df['Book-Title'] = [re.sub(r'[^0-9a-zA-Z:,]', ' ', str(i)) for i in df['Book-Title']]
# A removed character next to an existing space leaves two consecutive spaces; the text before the first such gap is the main title.
df['Main_Title'] = [i.split('  ')[0] for i in df['Book-Title']]
# Take everything after the main title as the subtitle; join with a space so the pieces do not run together.
df['Sub_Title'] = [' '.join(i.split('  ')[1:]).strip() for i in df['Book-Title']]
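To see what the cleaning and splitting steps do, here is a minimal, self-contained sketch. The sample titles and the helper names `clean_title` and `split_title` are made up for illustration and are not from the original dataset. It is also a slight variation on the code above: it splits with `re.split` on a run of two or more spaces instead of `str.split('  ')`, so a gap of three spaces is treated the same as a gap of two.

```python
import re

import pandas as pd

# Hypothetical sample titles; the real dataset is not shown in the post.
df = pd.DataFrame({'Book-Title': [
    'Classical Mythology',
    'Flu: The Story (Paperback)',
    'Dune - Messiah',
]})


def clean_title(title):
    """Replace every character outside 0-9, a-z, A-Z, ':' and ',' with a space."""
    return re.sub(r'[^0-9a-zA-Z:,]', ' ', str(title))


def split_title(cleaned):
    """Split at the first run of two or more spaces into (main, sub)."""
    parts = re.split(r' {2,}', cleaned.strip(), maxsplit=1)
    main = parts[0]
    sub = parts[1] if len(parts) > 1 else ''
    return main, sub


cleaned = df['Book-Title'].map(clean_title)
pairs = [split_title(c) for c in cleaned]
df['Main_Title'] = [main for main, _ in pairs]
df['Sub_Title'] = [sub for _, sub in pairs]

print(df[['Main_Title', 'Sub_Title']])
```

Titles without any removed punctuation keep an empty string in `Sub_Title`, which is the case the next step handles.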

 

2) Changing Values with np.where

import numpy as np

# If Sub_Title is empty, set it to 'No_SUB'; otherwise keep the existing value.
# np.where(condition, value where True, value where False)
df['Sub_Title'] = np.where(df['Sub_Title'] == '', 'No_SUB', df['Sub_Title'])
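The following small sketch shows the `np.where` replacement on its own. The sample subtitles are hypothetical. For comparison it also shows `Series.mask`, a pandas-only way to get the same result; that alternative is not used in the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical subtitles; an empty string stands for "no subtitle".
df = pd.DataFrame({'Sub_Title': ['', 'Paperback', 'Messiah']})

# np.where checks the condition element by element: where it is True the
# second argument is used, otherwise the value from the third argument.
df['Sub_Title'] = np.where(df['Sub_Title'] == '', 'No_SUB', df['Sub_Title'])

# Same replacement with pandas only: mask() overwrites values where the
# condition is True and leaves the rest unchanged.
alt = pd.Series(['', 'Paperback', 'Messiah'])
alt = alt.mask(alt == '', 'No_SUB')

print(df['Sub_Title'].tolist())
print(alt.tolist())
```

`np.where` returns a NumPy array, which pandas aligns by position when it is assigned back to the column, so the row order must not change between building the condition and assigning the result.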

 
