pandasFind data in excel or csv table in the specified line of information (super detailed)
Key !!!! Use the loc function to find it.
Without further ado, let's get right to the demo:
There are the following named tables:
1. According to index query
Condition: First, the imported data must have an index.
Or add it yourself. Easy way to do it.Read excelAdd index_col directly to the file.
Code Example:
import pandas as pd #import pandas library
excel_file = '. /' #Import excel data
data = pd.read_excel(excel_file, index_col='name')
# The index_col of this is the index, you can choose any field as the index index, read in data
print(['Li Si'])
The printout is
Sector B
Wages 6600
Name: the fourth child in the family, dtype: object
(Point of attention: Indexing)
2. Known data in the first line to find the desired data
If we have a table with empty salary data for an employee, how do we find the data we want.
The code is as follows:
for i in :
for j in range(len(data)):
if (data[i].isnull())[j]:
bumen = [j, [0]] #find the sector where the missing value is located
data[i][j] = charuzhi(bumen)
The principle is simple, first retrieve all the data, then we can use the iloc function in pandas. The above iloc [j, [2]] in j is the specific location, [0] is the data you want to get where the column
3. According to the conditions of the query to find the specified row of data
For example, find the names and salaries of all the members of department A or those whose salaries are less than 3,000:
The code is as follows:
"""Query a row of data based on conditions""""
import pandas as pd #import pandas library
excel_file = '. /' #Import file
data = pd.read_excel(excel_file) #read in data
print([data['department'] == 'A', ['name', 'salary']]) #Department is A, print name and salary
print([data['salary'] < 3000, ['name','salary']]) #Find people with salaries less than 3000
The results are as follows:
To generate these data independently as an excel file or csv file:
Add the following code
"""Export to excel or csv file""""
dataframe_1 = [data['Department'] == 'A', ['Name', 'Salary']]
dataframe_2 = [data['salary'] < 3000, ['name', 'salary']]
dataframe_1.to_excel('dataframe_1.xlsx')
dataframe_2.to_excel('dataframe_2.xlsx')
4. Find the specified column
data['columns'] #columns i.e. the names of the fields you need will do.
# Note that the columns of this column cannot be the name of the index
# If you want to print the index then
# Same as above
The libraries used for the whole process above:
pandas,xlrd , openpyxl
5. Find the specified rows and specified columns
The main function used is the iloc
[:,:2] #i.e. all rows, first two columns of data
It's easy to understand the range of rows before the comma and columns after the comma
6. Identify eligible data within the specified range
[:10,:][data.wages>6000]
This will find all the people in the first 11 rows with a salary greater than 6000.