【数据准备和特征工程】1-2感知数据库中的数据

发布于:2022-07-26 ⋅ 阅读:(1275) ⋅ 点赞:(1)

pymysql读取数据

import pymysql
mydb = pymysql.connect(host="localhost",    # ①
                       user='root',
                       password='********',
                       db="books",
                      )
cursor = mydb.cursor()    # ②
sql = "select * from mybooks"    # ③
cursor.execute(sql)    # ④
datas = cursor.fetchall()    # ⑤
for data in datas:
    print(data)
(1, 'Learn Python', 'phei')
(2, 'Django', 'phei')
(3, 'Data Analysis', 'phei')
(4, 'Machine Learning', 'PHEI')
(5, '数据可视化案例', 'gitchat')
(6, '零基础入手Python', 'gitchat')

pandas读取数据

import pandas as pd

df = pd.read_sql_query(sql, con=mydb, index_col='id')
df
D:\CS\Apps\anaconda\anaconda3\envs\ai\lib\site-packages\pandas\io\sql.py:761: UserWarning: pandas only support SQLAlchemy connectable(engine/connection) ordatabase string URI or sqlite3 DBAPI2 connectionother DBAPI2 objects are not tested, please consider using SQLAlchemy
  warnings.warn(
name author
id
1 Learn Python phei
2 Django phei
3 Data Analysis phei
4 Machine Learning PHEI
5 数据可视化案例 gitchat
6 零基础入手Python gitchat

将数据保存到数据库

先准备要放入数据库的数据

df = pd.read_csv("../data/HospInfo.csv")
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4812 entries, 0 to 4811
Data columns (total 29 columns):
 #   Column                                                         Non-Null Count  Dtype 
---  ------                                                         --------------  ----- 
 0   Provider ID                                                    4812 non-null   int64 
 1   Hospital Name                                                  4812 non-null   object
 2   Address                                                        4812 non-null   object
 3   City                                                           4812 non-null   object
 4   State                                                          4812 non-null   object
 5   ZIP Code                                                       4812 non-null   int64 
 6   County Name                                                    4797 non-null   object
 7   Phone Number                                                   4812 non-null   int64 
 8   Hospital Type                                                  4812 non-null   object
 9   Hospital Ownership                                             4812 non-null   object
 10  Emergency Services                                             4812 non-null   bool  
 11  Meets criteria for meaningful use of EHRs                      4668 non-null   object
 12  Hospital overall rating                                        4812 non-null   object
 13  Hospital overall rating footnote                               1398 non-null   object
 14  Mortality national comparison                                  4812 non-null   object
 15  Mortality national comparison footnote                         1352 non-null   object
 16  Safety of care national comparison                             4812 non-null   object
 17  Safety of care national comparison footnote                    2168 non-null   object
 18  Readmission national comparison                                4812 non-null   object
 19  Readmission national comparison footnote                       1017 non-null   object
 20  Patient experience national comparison                         4812 non-null   object
 21  Patient experience national comparison footnote                1369 non-null   object
 22  Effectiveness of care national comparison                      4812 non-null   object
 23  Effectiveness of care national comparison footnote             1202 non-null   object
 24  Timeliness of care national comparison                         4812 non-null   object
 25  Timeliness of care national comparison footnote                1266 non-null   object
 26  Efficient use of medical imaging national comparison           4812 non-null   object
 27  Efficient use of medical imaging national comparison footnote  2033 non-null   object
 28  Location                                                       4812 non-null   object
dtypes: bool(1), int64(3), object(25)
memory usage: 1.0+ MB

将数据存入数据库

import pymysql
import sqlalchemy
from sqlalchemy import create_engine
connect_info = 'mysql+pymysql://{}:{}@{}:{}/{}'.format('root',
                                                       '********',
                                                       'localhost',
                                                       3306,
                                                       'pymysql')
engine = create_engine(connect_info)
df.to_sql('hosp_info',engine,if_exists='replace')
# if_exists='',replace:删除表,重新创建保存数据(覆盖数据),append:向表中追加数据
4812

看看存放结果

conn = pymysql.connect(host="localhost",    # ①
                       user='root',
                       password='********',
                       db="pymysql")
cursor = conn.cursor()
sql = "select * from hosp_info"


df = pd.read_sql(sql, conn, index_col='Provider ID')
df.head()
D:\CS\Apps\anaconda\anaconda3\envs\ai\lib\site-packages\pandas\io\sql.py:761: UserWarning: pandas only support SQLAlchemy connectable(engine/connection) ordatabase string URI or sqlite3 DBAPI2 connectionother DBAPI2 objects are not tested, please consider using SQLAlchemy
  warnings.warn(
index Hospital Name Address City State ZIP Code County Name Phone Number Hospital Type Hospital Ownership ... Readmission national comparison footnote Patient experience national comparison Patient experience national comparison footnote Effectiveness of care national comparison Effectiveness of care national comparison footnote Timeliness of care national comparison Timeliness of care national comparison footnote Efficient use of medical imaging national comparison Efficient use of medical imaging national comparison footnote Location
Provider ID
10005 0 MARSHALL MEDICAL CENTER SOUTH 2505 U S HIGHWAY 431 NORTH BOAZ AL 35957 MARSHALL 2565938310 Acute Care Hospitals Government - Hospital District or Authority ... None Same as the national average None Same as the national average None Above the national average None Below the national average None 2505 U S HIGHWAY 431 NORTH\nBOAZ, AL\n
10012 1 DEKALB REGIONAL MEDICAL CENTER 200 MED CENTER DRIVE FORT PAYNE AL 35968 DE KALB 2568453150 Acute Care Hospitals Proprietary ... None Same as the national average None Same as the national average None Above the national average None Same as the national average None 200 MED CENTER DRIVE\nFORT PAYNE, AL\n
10032 2 WEDOWEE HOSPITAL 209 NORTH MAIN STREET WEDOWEE AL 36278 RANDOLPH 2563572111 Acute Care Hospitals Government - Hospital District or Authority ... None Not Available Results are not available for this reporting p... Same as the national average None Same as the national average None Not Available Results are not available for this reporting p... 209 NORTH MAIN STREET\nWEDOWEE, AL\n
10095 3 HALE COUNTY HOSPITAL 508 GREEN STREET GREENSBORO AL 36744 HALE 3346243024 Acute Care Hospitals Government - Local ... Results are not available for this reporting p... Not Available Results are not available for this reporting p... Below the national average None Same as the national average None Not Available Results are not available for this reporting p... 508 GREEN STREET\nGREENSBORO, AL\n
10131 4 CRESTWOOD MEDICAL CENTER ONE HOSPITAL DR SE HUNTSVILLE AL 35801 MADISON 2568823100 Acute Care Hospitals Proprietary ... None Same as the national average None Same as the national average None Same as the national average None Same as the national average None ONE HOSPITAL DR SE\nHUNTSVILLE, AL\n

5 rows × 29 columns


网站公告

今日签到

点亮在社区的每一天
去签到