Python Archives - Celery-Q (https://celeryq.org), Programming Blog

Python Cosine Similarity: Mastering for Practical Use
https://celeryq.org/python-cosine-similarity/
Published: Thu, 21 Sep 2023 07:18:07 +0000


Welcome to our in-depth exploration of Python cosine similarity, a vital concept with broad applications in data analysis, text processing, and machine learning. In this comprehensive article, we aim to demystify cosine similarity, providing you with a deep understanding of its real-world uses and how to implement it using Python. 

Whether you are new to programming or an experienced coder, this article equips you with the knowledge and skills needed to harness the potential of cosine similarity effectively.

Understanding Cosine Similarity

Core Concepts

Cosine similarity is a mathematical tool used to quantify the similarity between two non-zero vectors in a multi-dimensional space. Put simply, it helps us measure how similar or dissimilar two sets of data points are, making it invaluable in various fields.

The Essence of Cosine Similarity

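Picture two vectors, A and B, in a multi-dimensional space. The cosine similarity between them is the cosine of the angle θ formed between them, so it ranges from -1 to 1. The closer θ is to 0 degrees, the closer the value is to 1 and the more similar the vectors are in direction. If θ equals 90 degrees, indicating orthogonality, the similarity is 0 and the vectors are dissimilar. At 180 degrees the vectors point in opposite directions and the similarity is -1.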
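To make the link between angle and similarity concrete, here is a minimal sketch using only the standard library. The helper name cosine_similarity_2d and the example vectors are illustrative assumptions, not part of any library.

```python
import math


def cosine_similarity_2d(a, b):
    """Cosine of the angle between two 2-D vectors given as (x, y) tuples."""
    dot = a[0] * b[0] + a[1] * b[1]
    return dot / (math.hypot(*a) * math.hypot(*b))


same_direction = cosine_similarity_2d((1, 0), (3, 0))   # angle of 0 degrees
orthogonal = cosine_similarity_2d((1, 0), (0, 2))       # angle of 90 degrees
opposite = cosine_similarity_2d((1, 0), (-2, 0))        # angle of 180 degrees

print(same_direction, orthogonal, opposite)  # 1.0 0.0 -1.0
```

Note that the vector lengths differ in each pair, yet the result depends only on direction.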

The Cosine Similarity Formula

Mathematically, cosine similarity can be expressed as follows:

cosine_similarity(A, B) =  (A ⋅ B) / (∥A∥ * ∥B∥)

Where:

  •  A ⋅ B represents the dot product of vectors A and B;
  • ∥A∥ and ∥B∥ denote the magnitudes (or norms) of vectors A and B, respectively.

Practical Applications of Cosine Similarity

Cosine similarity has a wide range of applications. Let’s delve into some key areas where it plays a significant role:

Text Similarity and Information Retrieval

In the realm of natural language processing (NLP), cosine similarity is extensively used to gauge the similarity between textual documents. This aids in tasks like document retrieval, plagiarism detection, and recommendation systems, making text comparisons efficient by representing documents as vectors.
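As a rough illustration of representing documents as vectors, the sketch below counts words in each text and compares the counts. The function names and the sample sentences are assumptions made for this example; real systems usually use richer representations such as TF-IDF.

```python
import math
import re
from collections import Counter


def text_to_vector(text):
    """Lowercase the text and count word occurrences (a bag-of-words vector)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def text_cosine_similarity(text_a, text_b):
    vec_a, vec_b = text_to_vector(text_a), text_to_vector(text_b)
    # Only words present in both texts contribute to the dot product
    dot = sum(vec_a[word] * vec_b[word] for word in set(vec_a) & set(vec_b))
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty texts
    return dot / (norm_a * norm_b)


sim_close = text_cosine_similarity("the cat sat on the mat", "the cat lay on the mat")
sim_far = text_cosine_similarity("the cat sat on the mat", "stock prices fell sharply")
print(round(sim_close, 3), sim_far)  # 0.875 0.0
```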

Recommender Systems

Recommendation engines on platforms such as Amazon and Netflix are commonly cited examples of similarity-based recommendation. By representing user preferences and item descriptions as vectors, such systems can use cosine similarity to provide personalized recommendations, enhancing the user experience.

Clustering and Classification

Cosine similarity is fundamental in clustering and classification tasks, helping group similar data points for pattern recognition and data organization.

Image Processing

In the field of computer vision, cosine similarity assists in image matching and retrieval. By converting images into feature vectors, visually similar images can be identified within large databases.

Implementing Cosine Similarity in Python

Now that we’ve covered the core concepts and practical applications of cosine similarity, let’s explore its Python implementation.

Calculating Cosine Similarity in Python

Python offers several libraries, including NumPy and Scikit-Learn, to compute cosine similarity efficiently. We’ll start with NumPy and a manual implementation, then use Scikit-Learn for text data.

Using NumPy

NumPy, a robust numerical computing library, simplifies cosine similarity calculations. Here’s a code snippet demonstrating its use:

import numpy as np

# Define two vectors, A and B
A = np.array([1, 2, 3])
B = np.array([4, 5, 6])

# Calculate cosine similarity
similarity = np.dot(A, B) / (np.linalg.norm(A) * np.linalg.norm(B))

Manual Implementation

For a deeper understanding of the underlying mathematics, you can implement cosine similarity manually:

def cosine_similarity(A, B):
    dot_product = sum(a * b for a, b in zip(A, B))
    magnitude_A = sum(a * a for a in A) ** 0.5
    magnitude_B = sum(b * b for b in B) ** 0.5
    return dot_product / (magnitude_A * magnitude_B)
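The formula divides by the vector magnitudes, so it is undefined when either vector is all zeros. The variant below is a minimal sketch that makes this explicit; the name safe_cosine_similarity and the choice to raise ValueError are assumptions for illustration.

```python
import math


def safe_cosine_similarity(a, b):
    """Cosine similarity with explicit checks for mismatched or zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot_product = sum(x * y for x, y in zip(a, b))
    magnitude_a = math.sqrt(sum(x * x for x in a))
    magnitude_b = math.sqrt(sum(y * y for y in b))
    if magnitude_a == 0 or magnitude_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot_product / (magnitude_a * magnitude_b)


result = safe_cosine_similarity([1, 2, 3], [4, 5, 6])
print(round(result, 4))  # 0.9746
```

Depending on the application, returning 0.0 for a zero vector may be a more convenient convention than raising an error.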

Choosing the Appropriate Cosine Similarity Variant

TF-IDF (Term Frequency-Inverse Document Frequency)

In NLP, especially when working with text data, the TF-IDF representation is commonly used to calculate cosine similarity. TF-IDF considers term frequency and inverse document frequency to weigh the importance of terms in distinguishing documents. Scikit-Learn provides a straightforward TF-IDF vectorizer:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Sample documents
documents = ["This is the first document.",
             "This document is the second document.",
             "And this is the third one.",
             "Is this the first document?"]

# Create TF-IDF vectors
tfidf_vectorizer = TfidfVectorizer()
tfidf_matrix = tfidf_vectorizer.fit_transform(documents)

# Calculate pairwise cosine similarity (a 4 x 4 matrix, one row per document)
cosine_sim = cosine_similarity(tfidf_matrix, tfidf_matrix)
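A common next step is to find, for each document, the other document it resembles most. The following is a sketch under the assumption that Scikit-Learn and NumPy are installed; the masking of the diagonal is one possible way to ignore self-matches.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = ["This is the first document.",
             "This document is the second document.",
             "And this is the third one.",
             "Is this the first document?"]

tfidf_matrix = TfidfVectorizer().fit_transform(documents)
cosine_sim = cosine_similarity(tfidf_matrix)  # pairwise similarities, shape (4, 4)

# Ignore each document's similarity with itself before picking the best match
masked = cosine_sim.copy()
np.fill_diagonal(masked, -1.0)
closest = masked.argmax(axis=1)

for i, j in enumerate(closest):
    print(f"Document {i} is closest to document {j} (similarity {cosine_sim[i, j]:.2f})")
```

The first and fourth documents contain the same words in a different order, so a bag-of-words representation such as TF-IDF treats them as identical.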

Cosine Similarity for Sparse Data

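When working with high-dimensional and sparse data, such as text data, it’s crucial to consider memory and computational efficiency. Note that SciPy’s scipy.spatial.distance.cosine expects dense one-dimensional arrays, so it is not suited to sparse matrices. Scikit-Learn’s cosine_similarity accepts SciPy sparse matrices directly:

from scipy import sparse
from sklearn.metrics.pairwise import cosine_similarity

# Sparse row vectors built from the NumPy arrays A and B defined earlier
sparse_A = sparse.csr_matrix(A)
sparse_B = sparse.csr_matrix(B)

# Calculate cosine similarity (the result is a 1 x 1 array)
similarity = cosine_similarity(sparse_A, sparse_B)[0, 0]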

Conclusion

We have embarked on an insightful journey into the world of Python cosine similarity. We started with the fundamentals, understanding how it quantifies the similarity between vectors in multi-dimensional spaces. We then explored its extensive applications, spanning text similarity, recommendation systems, clustering, and image analysis.

You’ve gained practical insights into implementing cosine similarity in Python, whether through the NumPy library or manual calculations. We also discussed specialized cases like TF-IDF and efficient handling of sparse data.

With this knowledge at your disposal, you are now well-prepared to tackle real-world data analysis, natural language processing, and machine learning tasks with confidence. Python cosine similarity is a versatile tool that can enhance the quality and efficiency of your projects. So, go ahead, experiment, and unlock its potential in your programming endeavors.

Python Age Calculator: Guide For You
https://celeryq.org/python-age-calculator/
Published: Thu, 21 Sep 2023 07:09:11 +0000


In this tutorial, we’ll dive into creating a simple Python program to calculate a user’s age. Whether you’re a beginner just starting to learn Python or someone looking to embark on their first small project, this tutorial is designed for you. We’ll harness the power of Python’s built-in datetime module to accomplish this task, so there’s no need to install any additional dependencies.

Building the Age Calculator in Python

Let’s begin by importing the necessary module:

# Import the required dependency
from datetime import date

Next, we’ll prompt the user to input their birthday and create a date object with that information:

# Ask the user to input year, month, and day of their birthday
# and output a date object of their birthday
def get_user_birthday():
    birth_year = int(input('Please enter your birth year: '))
    birth_month = int(input('Please enter your birth month: '))
    birth_day = int(input('Please enter your birth day: '))
    # Convert user input into a date object
    user_birthday = date(birth_year, birth_month, birth_day)
    return user_birthday

Now, we can proceed to calculate the user’s age:

# Define a function to calculate the user's age
def calculate_age(user_birthday):
    # Get the current date
    today = date.today()
    # Calculate the years difference
    year_diff = today.year - user_birthday.year
    # Check whether today comes before this year's birthday (True counts as 1)
    precedes_flag = ((today.month, today.day) < (user_birthday.month, user_birthday.day))
    # Calculate the user's age
    age = year_diff - precedes_flag
    # Return the age value
    return age

Finally, we can execute the code:

if __name__ == "__main__":
    user_birthday = get_user_birthday()
    current_age = calculate_age(user_birthday)
    print(f"Your age is: {current_age}")

Testing the code with a sample birthday:

Please enter your birth year: 1999
Please enter your birth month: 02
Please enter your birth day: 01

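The result depends on the date on which you run the program. For example, on any date from 1 February 2022 through 31 January 2023, you should get:

Your age is: 23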
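Because the function reads the current date, its output changes over time. A small variant that accepts the reference date as a parameter makes the behaviour easy to check; the name calculate_age_on and the chosen dates are assumptions for this sketch.

```python
from datetime import date


def calculate_age_on(user_birthday, today):
    """Age in whole years on the given date."""
    year_diff = today.year - user_birthday.year
    # True (counted as 1) when this year's birthday has not happened yet
    birthday_still_ahead = (today.month, today.day) < (user_birthday.month, user_birthday.day)
    return year_diff - birthday_still_ahead


birthday = date(1999, 2, 1)
age_jan_2023 = calculate_age_on(birthday, date(2023, 1, 31))  # the day before the birthday
age_feb_2023 = calculate_age_on(birthday, date(2023, 2, 1))   # on the birthday itself
print(age_jan_2023, age_feb_2023)  # 23 24
```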

Complete Code

Here’s the complete code for your reference:

# Import the required dependency
from datetime import date


# Ask the user to input year, month, and day of their birthday
# and output a date object of their birthday
def get_user_birthday():
    birth_year = int(input('Please enter your birth year: '))
    birth_month = int(input('Please enter your birth month: '))
    birth_day = int(input('Please enter your birth day: '))
    # Convert user input into a date object
    user_birthday = date(birth_year, birth_month, birth_day)
    return user_birthday


# Define a function to calculate the user's age
def calculate_age(user_birthday):
    # Get the current date
    today = date.today()
    # Calculate the years difference
    year_diff = today.year - user_birthday.year
    # Check whether today comes before this year's birthday (True counts as 1)
    precedes_flag = ((today.month, today.day) < (user_birthday.month, user_birthday.day))
    # Calculate the user's age
    age = year_diff - precedes_flag
    # Return the age value
    return age


if __name__ == "__main__":
    user_birthday = get_user_birthday()
    current_age = calculate_age(user_birthday)
    print(f"Your age is: {current_age}")

More Precise Method

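The calculate_age function above is the more precise method: it considers the birth year, month, and day, so it stays correct when this year’s birthday has not happened yet. A simpler method subtracts only the birth year from the current year, which can overstate the age by one year.

The table below summarizes the differences.

Method | Calculation | Precision
Simple Method | Current Year - Birth Year | Less Precise
Precise Method | Current Date - Birth Date | More Precise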
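The two rows of the table can be sketched side by side as follows. The function names and the fixed dates are assumptions chosen so that the result is reproducible.

```python
from datetime import date


def simple_age(birthday, today):
    """Simple method: difference between the calendar years only."""
    return today.year - birthday.year


def precise_age(birthday, today):
    """Precise method: subtract one year if the birthday has not occurred yet."""
    had_birthday = (today.month, today.day) >= (birthday.month, birthday.day)
    return today.year - birthday.year - (0 if had_birthday else 1)


birthday = date(1999, 12, 15)
today = date(2023, 9, 21)  # fixed reference date instead of date.today()

simple_result = simple_age(birthday, today)
precise_result = precise_age(birthday, today)
print(simple_result, precise_result)  # 24 23
```

The two methods agree once the birthday has passed in the current year and differ by one before it.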

Conclusion

In this tutorial, we explored two methods for calculating your age using Python. You can choose between a simple method for a quick result or a more precise method for accuracy. Feel free to leave comments below if you have any questions or suggestions for improvements. Happy coding!

Unlocking Python’s Full Potential with Executable Files
https://celeryq.org/convert-python-to-exe/
Published: Thu, 21 Sep 2023 06:54:16 +0000


For many Python users, ensuring the accessibility and reproducibility of their projects within their community or organization is paramount. However, not everyone is well-versed in Python, and not everyone has it installed on their systems. In such cases, the convenience of working with familiar file extensions like .EXE becomes evident.

In this comprehensive guide, we will delve into the process of taking a simple Python project and transforming it into a Windows program in .EXE format. This transformation allows users to run the application on their computers without the need to install all the associated dependencies.

To follow along with this tutorial, you will need a Python library called PyInstaller.

If you haven’t already installed it, you can do so by opening the “Command Prompt” on Windows and using the following command:

pip install pyinstaller

Sample Python Code

Before we proceed with converting our Python code to an executable file, we need some Python code to work with. In this tutorial, we will use a Python program that creates a simple GUI calculator using the Tkinter library.

Here is the Python code we’ll be working with:

# Python code for the calculator application…
# …

Save this code in a file named calculator.py.

You can test this application by running it using the Python interpreter:

python calculator.py

This will launch the calculator GUI, and you can use it as a regular calculator.

Convert .PY File to .EXE File – Basic

Now that we have our Python code ready, we can proceed to convert it into an executable (.EXE) file. The basic conversion process is relatively straightforward.

First, open your terminal or command prompt and navigate to the directory where your calculator.py file is located. You should be in the same directory as the Python file.

Next, run the following command:

pyinstaller calculator.py

This command tells PyInstaller to create an executable file from calculator.py. The process will take some time as it bundles the Python application and its dependencies into a single package with an executable file.

After the process is complete, you will find the generated .EXE file in the dist/calculator folder. It will have the same name as your Python file, which is calculator.exe in this case.

You can double-click on calculator.exe, and the calculator GUI will open. However, please note that a console window will also appear for standard input/output. If you want a standalone calculator application without the console window, proceed to the advanced section.

Convert .PY File to .EXE File – Advanced

In this advanced section, we will customize the conversion process to achieve two goals:

  1. Create a one-file bundled executable;
  2. Remove the console window.

Let’s go back to the original structure of the working directory where we have the calculator.py file.

By default, PyInstaller creates a one-folder bundle containing an executable file and various other files and folders. To simplify this and create a one-file bundled executable, we use the --onefile option:

pyinstaller --onefile calculator.py

Additionally, we can remove the console window that appears when running the executable by using the --noconsole option:

pyinstaller --onefile --noconsole calculator.py

After running this command, you will find the standalone calculator application in the dist folder. There will be only one executable file, and when you double-click on it, the calculator GUI will open without the console window.
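If you build executables from a script rather than typing the command by hand, a small helper can assemble the command line. This is only a sketch: build_pyinstaller_command is a hypothetical helper written for this example, and it merely constructs the argument list without running PyInstaller.

```python
import shlex


def build_pyinstaller_command(script, onefile=False, noconsole=False):
    """Return the PyInstaller command line as a list of arguments."""
    command = ["pyinstaller"]
    if onefile:
        command.append("--onefile")
    if noconsole:
        command.append("--noconsole")
    command.append(script)
    return command


basic = build_pyinstaller_command("calculator.py")
advanced = build_pyinstaller_command("calculator.py", onefile=True, noconsole=True)

print(shlex.join(basic))     # pyinstaller calculator.py
print(shlex.join(advanced))  # pyinstaller --onefile --noconsole calculator.py
```

The resulting list can be passed to subprocess.run(command, check=True) on a machine where PyInstaller is installed.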

Conversion Options Comparison

Here, we compare two different approaches to converting Python code to executable (.EXE) files: the basic method and the advanced method.

Conversion Aspect | Basic Method | Advanced Method
Executable Output | Single executable file (calculator.exe) in dist/calculator folder. | Single executable file (calculator.exe) in dist folder.
Console Window | Console window appears for standard input/output. | No console window; standalone executable.
Simplicity of Directory | Multiple files and folders generated. | Fewer files, simpler directory structure.
Conversion Command | pyinstaller calculator.py | pyinstaller --onefile --noconsole calculator.py
Usage for Distribution | Suitable for distribution with console interface. | Suitable for standalone applications.

User-Friendly Interface

One of the key factors in making your Python executable appealing to users is the interface. While Python excels in functionality, aesthetics often require a little extra effort. Here are some tips:

  • Graphical User Interface (GUI): Whenever possible, create a graphical user interface for your Python application. Libraries like Tkinter, PyQt, and Kivy allow you to design interactive and visually pleasing interfaces;
  • Responsive Design: Ensure your GUI elements are well-arranged and responsive. Elements should adjust gracefully when users resize the application window;
  • Intuitive Controls: Label buttons, fields, and menus clearly. Use intuitive icons and tooltips to guide users;
  • Error Handling: Implement error messages and dialogs that provide informative feedback to users when something goes wrong.

Documentation and Help

Users often appreciate clear documentation and built-in help features. Here’s how to provide them:

  • User Manual: Create a user manual or documentation that explains how to use your Python executable. Include step-by-step instructions, examples, and troubleshooting tips;
  • In-App Help: Incorporate an in-app help feature that provides information on specific functions or features. Tooltips, pop-up guides, or a dedicated help menu can be helpful.

Conclusion 

Creating a user-friendly Python executable involves more than just converting code into an .EXE file. It requires attention to user interface design, documentation, performance optimization, regular updates, user feedback, and security. By implementing these tips and tricks, you can elevate your Python executable’s user experience and increase user satisfaction.

Unlock the full potential of your Python applications and provide users with a seamless and enjoyable experience.

FAQ

1. What is a Python executable?

A Python executable is a standalone application that can run Python scripts without requiring users to have Python installed. It typically has a .EXE file extension on Windows.

2. Why convert Python code to an executable?

Converting Python code to an executable makes it accessible to users who may not have Python installed. It simplifies distribution and usage of Python applications.

3. How can I convert a Python script to an executable?

You can use tools like PyInstaller, cx_Freeze, py2exe, or py2app to convert Python scripts into executable files. PyInstaller is a popular choice for cross-platform compatibility.

4. What is PyInstaller?

PyInstaller is a widely used Python library that bundles Python applications and their dependencies into a single executable file. It simplifies the process of creating standalone Python applications.

5. Can I create GUI applications with Python executables?

Yes, you can create graphical user interface (GUI) applications using libraries like Tkinter, PyQt, or Kivy, and then convert them into Python executables.

Normality Tests in Python: Assessing Data Distribution
https://celeryq.org/normality-test-python/
Published: Thu, 21 Sep 2023 06:47:14 +0000


In this comprehensive guide, we dive into the world of normality tests in Python, exploring various statistical methods to assess whether a dataset follows a normal distribution. We’ll use a real-world example, analyzing the weekly historical returns of Microsoft stock. By the end of this tutorial, you’ll gain insights into the Jarque-Bera test, Kolmogorov-Smirnov test, Anderson-Darling test, and Shapiro-Wilk test, along with practical Python implementations.

Sample Data for Normality Testing

Let’s begin by setting up our sample data. We’ll be working with weekly historical returns for Microsoft stock from January 1, 2018, to December 31, 2018. This dataset can be easily obtained from Yahoo! Finance.

import pandas as pd

# Load data from a CSV file
df = pd.read_csv('MSFT.csv')

# Select relevant columns
df = df[['Date', 'Close']]

We’ll convert stock prices into returns, a common practice in financial analysis. Then, we’ll visualize the data with a histogram.

import numpy as np
import matplotlib.pyplot as plt

# Calculate returns
df['diff'] = pd.Series(np.diff(df['Close']))
df['return'] = df['diff'] / df['Close']

# Drop missing values
df = df[['Date', 'return']].dropna()

# Visualize data with a histogram
plt.hist(df['return'])
plt.show()
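To see what the return calculation does, it helps to run it on a tiny series. The sketch below uses three made-up prices (an assumption for illustration, not Microsoft data) and compares the approach above with pandas’ built-in pct_change.

```python
import numpy as np
import pandas as pd

prices = pd.DataFrame({"Close": [100.0, 102.0, 96.9]})  # made-up prices

# Approach used above: next price minus current price, divided by current price
diff_returns = (pd.Series(np.diff(prices["Close"])) / prices["Close"]).dropna()

# Built-in alternative: percentage change relative to the previous row
pct_returns = prices["Close"].pct_change().dropna()

print(diff_returns.round(4).tolist())  # [0.02, -0.05]
print(pct_returns.round(4).tolist())   # [0.02, -0.05]
```

Both give the same values. The difference is the row they are attached to: the first approach stores each return on the earlier date, while pct_change stores it on the later date.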

Quantile-Quantile (Q-Q) Plot

We start our exploration with a Q-Q plot, a visual method to assess the normality of data. This plot compares observed quantiles against theoretical quantiles of a normal distribution. Deviations from a straight line indicate non-normality.

import pylab
import scipy.stats as stats

# Create a Q-Q plot
stats.probplot(df['return'], dist="norm", plot=pylab)
pylab.show()

The Q-Q plot shows a reasonably linear relationship, suggesting that the data approximates normality but is not perfect.

Jarque-Bera Test

The Jarque-Bera test assesses whether a dataset’s skewness and kurtosis match those of a normal distribution. A high test statistic indicates significant deviation from normality.

from scipy.stats import jarque_bera

result = jarque_bera(df['return'])

In our case, the test statistic is 1.94, and the p-value is approximately 0.38. Since the p-value is greater than 0.05, we fail to reject the null hypothesis: the test finds no significant evidence against normality.
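The statistic combines sample skewness S and kurtosis K as JB = n/6 * (S^2 + (K - 3)^2 / 4). The sketch below computes it by hand and compares it with SciPy. The data is synthetic (a random normal sample standing in for weekly returns), so the numbers will not match the Microsoft results.

```python
import numpy as np
from scipy.stats import jarque_bera

rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=0.03, size=52)  # synthetic stand-in for weekly returns

n = sample.size
centered = sample - sample.mean()
m2 = np.mean(centered ** 2)                       # second central moment
skewness = np.mean(centered ** 3) / m2 ** 1.5     # S
kurtosis = np.mean(centered ** 4) / m2 ** 2       # K (equals 3 for a normal distribution)

jb_manual = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
jb_scipy, p_value = jarque_bera(sample)

print(jb_manual, jb_scipy, p_value)
```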

Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov test is a non-parametric test that compares the empirical distribution function (ECDF) of the data to that of a theoretical distribution, such as the normal distribution.

from scipy.stats import kstest

result = kstest(df['return'], cdf='norm')

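Here, the K-S statistic is 0.47, and the p-value is close to zero. With a small p-value, we reject the null hypothesis, indicating that the data does not follow the reference distribution. This result should be read with care: with cdf='norm', kstest compares the data against the standard normal distribution (mean 0, standard deviation 1), whereas weekly returns have a much smaller spread. A large statistic therefore largely reflects the difference in scale rather than in shape, and standardizing the returns first gives a fairer comparison.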
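The effect of scale on the K-S test can be demonstrated with synthetic data. In the sketch below, the sample is drawn from a normal distribution with a small standard deviation (an assumption mimicking the magnitude of weekly returns), tested once as-is and once after standardization.

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(0)
returns = rng.normal(loc=0.002, scale=0.03, size=52)  # synthetic, normally distributed

# Raw data compared against the standard normal N(0, 1)
raw_stat, raw_p = kstest(returns, "norm")

# Same data after subtracting the mean and dividing by the standard deviation
standardized = (returns - returns.mean()) / returns.std(ddof=1)
std_stat, std_p = kstest(standardized, "norm")

print(f"raw:          statistic={raw_stat:.2f}, p-value={raw_p:.2g}")
print(f"standardized: statistic={std_stat:.2f}, p-value={std_p:.2g}")
```

Even though the sample is normal by construction, the raw test rejects normality because of the scale mismatch. Keep in mind that estimating the mean and standard deviation from the same data makes the standardized p-value only approximate; the Lilliefors variant of the test is designed for that situation.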

Anderson-Darling Test

The Anderson-Darling test is a modification of the K-S test, providing more weight to the tails of the distribution. It is sensitive to deviations in both the tails and the center of the distribution.

from scipy.stats import anderson

result = anderson(df['return'], dist='norm')

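The A-D statistic is 0.37, and critical values are provided for significance levels ranging from 15% to 1%. Unlike the other tests, the decision is made by comparing the statistic with these critical values rather than with a p-value. Because the statistic falls below the critical values, we fail to reject the null hypothesis, which is consistent with normality.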
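The comparison with critical values can be written as a short loop. This is a sketch on synthetic data, assuming the result object exposes statistic, critical_values, and significance_level attributes as SciPy’s anderson has traditionally done; check the documentation of your installed version.

```python
import numpy as np
from scipy.stats import anderson

rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=0.03, size=52)  # synthetic stand-in for weekly returns

result = anderson(sample, dist="norm")

# Reject normality at a given level when the statistic exceeds that level's critical value
decisions = {}
for level, critical in zip(result.significance_level, result.critical_values):
    decisions[float(level)] = "Reject" if result.statistic > critical else "Fail to reject"
    print(f"{level:>4}% level: critical value {critical:.3f} -> {decisions[float(level)]}")
```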

Shapiro-Wilk Test

The Shapiro-Wilk test assesses whether a dataset is significantly different from a normal distribution. It is particularly suitable for small sample sizes.

from scipy.stats import shapiro

result = shapiro(df['return'])

With a Shapiro-Wilk statistic of 0.98 and a p-value of approximately 0.42, we fail to reject the null hypothesis, indicating that the data is not significantly different from a normal distribution.

Comparing Normality Test Results

Here’s a summary of the normality test results for our Microsoft stock returns data:

Test | H0: Normality
Jarque-Bera | Fail to reject
Kolmogorov-Smirnov | Reject
Anderson-Darling | Fail to reject
Shapiro-Wilk | Fail to reject

Advantages and Disadvantages of Different Normality Tests

In the realm of statistics and data analysis, testing for normality is a crucial step before applying various statistical methods. In this article, we have explored four common normality tests in Python: the Jarque-Bera test, Kolmogorov-Smirnov test, Anderson-Darling test, and Shapiro-Wilk test. Each of these tests serves a unique purpose and has its own set of advantages and disadvantages.

Jarque-Bera Test

Advantages:

  • Relatively easy to implement;
  • Suitable for large datasets;
  • Accounts for skewness and kurtosis.

Disadvantages:

  • Less powerful for small sample sizes;
  • Assumes independence of observations.

Kolmogorov-Smirnov Test

Advantages:

  • Non-parametric, making it distribution-free;
  • Suitable for comparing a sample distribution to a known distribution.

Disadvantages:

  • Less powerful for small sample sizes;
  • Focuses on the maximum difference between ECDFs, which may not capture subtle departures from normality.

Anderson-Darling Test

Advantages:

  • Powerful test, especially for small sample sizes;
  • Provides critical values for different significance levels;
  • Takes into account observations across the entire dataset.

Disadvantages:

  • Requires pre-defined significance levels for interpretation;
  • Sensitive to outliers.

Shapiro-Wilk Test

Advantages:

  • Works well for small to moderately-sized datasets;
  • Offers a p-value for hypothesis testing.

Disadvantages:

  • With large sample sizes, even trivial departures from normality tend to be flagged as statistically significant;
  • Sensitive to deviations in the tails of the distribution.

When choosing a normality test, it’s essential to consider the characteristics of your dataset, such as sample size and potential outliers. Each of these tests can be a valuable tool in assessing whether your data follows a normal distribution. However, no single test is universally superior, and the choice often depends on the specific context of your analysis.

Conclusion

In this comprehensive guide, we explored various normality tests in Python and applied them to real-world data. While different tests yielded varying results, it’s essential to consider the nature of your data and the specific requirements of your analysis when selecting a normality test. Understanding data distribution is a crucial step in many statistical analyses, as it can impact the validity of statistical tests and the choice of modeling techniques.

FAQ

1. What is normality testing?

Normality testing is a statistical process used to determine whether a dataset follows a normal or Gaussian distribution. It helps assess whether a dataset’s values are symmetrically distributed around the mean and if they exhibit the characteristic bell-shaped curve of a normal distribution.

2. Why is normality testing important?

Normality testing is essential in statistics because many statistical methods assume that the data being analyzed follows a normal distribution. Validating the normality assumption is crucial to ensure the accuracy of these methods. If the data is not normal, it may be necessary to use alternative statistical techniques.

3. What does it mean when a normality test fails?

If a normality test indicates that your dataset does not follow a normal distribution, it suggests that the assumptions of many parametric statistical tests may not be met. In such cases, consider using non-parametric tests or transformations to analyze your data.

4. Can I rely solely on normality tests to assess data distribution?

Normality tests are valuable tools, but they should be used in conjunction with visualizations (e.g., histograms and Q-Q plots) and domain knowledge. No single test can provide a definitive answer, and it’s essential to interpret the results in context.

How to Standardize Data in Python
https://celeryq.org/how-to-standardize-data/
Published: Thu, 21 Sep 2023 06:40:07 +0000


In the world of machine learning, one of the initial steps in feature engineering is data standardization. It’s crucial to ensure that your data is appropriately scaled, especially when dealing with various machine-learning models. Some models, such as linear regression, K-nearest neighbors (KNN), and support vector machines (SVM), are highly sensitive to features with different scales. On the other hand, models like decision trees, bagging, and boosting algorithms typically don’t require data scaling.

The Impact of Data Scaling

The impact of feature scaling on machine learning models is significant. Features with larger value ranges tend to dominate the decision-making process of algorithms because their effects on the outputs are more pronounced. To level the playing field and make sure all features are equally considered during model training, we turn to feature scaling techniques.

Two Popular Feature Scaling Techniques

Two of the most commonly used feature scaling techniques are:

  1. Z-Score Standardization: This technique rescales each feature using its mean and standard deviation, so the result has a mean of 0 and a standard deviation of 1;
  2. Min-Max Normalization: This method scales the data between a specified range, typically 0 and 1.
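Although this article focuses on z-scores, min-max normalization is simple enough to sketch in a few lines for comparison. The function name min_max_normalize and the sample weights are assumptions for illustration.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly so the smallest maps to new_min and the largest to new_max."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("cannot scale a constant feature")
    return [
        new_min + (value - lowest) / (highest - lowest) * (new_max - new_min)
        for value in values
    ]


weights = [300, 250, 800]
scaled = min_max_normalize(weights)
print([round(value, 3) for value in scaled])  # [0.091, 0.0, 1.0]
```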

In this article, we will delve into how to perform z-score standardization of data using Python, specifically utilizing the sklearn and pandas libraries.

Understanding Standardization

In statistics and machine learning, data standardization involves converting data into z-score values, which are based on the mean and standard deviation of the dataset. Essentially, each data point within a feature is transformed into the number of standard deviations by which it deviates from the mean. The result is a dataset with a mean of 0 and a standard deviation of 1. If the data is normally distributed, approximately 99.7% of the values fall between -3 and +3.

The z-score formula for a given observation x within a feature is as follows:

z = (x – mean) / standard deviation

Let’s consider a simple example to illustrate the concept of standardization.

Standardization Example

Suppose we have a dataset with two features: “Weight” in grams and “Price” in dollars.

Weight (g) | Price ($)
300        | 3
250        | 2
800        | 5

The weights range from 250g to 800g, while prices range from $2 to $5. These different scales make direct comparisons challenging.

Standardizing the Data

We’ll start by standardizing the “Weight” feature:

  • Mean of Weight: 450g;
  • Standard Deviation of Weight (population formula, dividing by n): 248.3277g.

For the first observation (Weight = 300g), the z-score is calculated as follows:

z = (300 – 450) / 248.3277 = -0.604

For the second observation (Weight = 250g):

z = (250 – 450) / 248.3277 = -0.805

And for the third observation (Weight = 800g):

z = (800 – 450) / 248.3277 = 1.409

We perform similar calculations for the “Price” feature using its mean and standard deviation.
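The arithmetic for both features can be checked with Python's standard library. This is a small sketch, assuming the population standard deviation (statistics.pstdev), which is the variant that yields 248.3277 for the weights. The helper name z_scores is illustrative.

```python
from statistics import mean, pstdev

weights = [300, 250, 800]
prices = [3, 2, 5]


def z_scores(values):
    """Return the z-score of each value, rounded to three decimals."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (divides by n)
    return [round((v - mu) / sigma, 3) for v in values]


print(z_scores(weights))  # [-0.604, -0.805, 1.409]
print(z_scores(prices))   # [-0.267, -1.069, 1.336]
```

The rounded results agree with the values listed in the standardized dataset.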

Standardized Dataset

After standardization, our dataset now looks like this:

Weight (standardized) | Price (standardized)
-0.604                | -0.267
-0.805                | -1.069
1.409                 | 1.336

The scales of the features have been aligned, making data visualization and analysis more meaningful.

How to Standardize Data in Python

To perform data standardization in Python, we’ll use the StandardScaler class from the sklearn library. First, let’s create a sample dataset as shown earlier.

import pandas as pd
data = {'Weight (g)': [300, 250, 800], 'Price ($)': [3, 2, 5]}
df = pd.DataFrame(data)

Now, let’s standardize the data:

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
standardized_data = scaler.fit_transform(df)
standardized_df = pd.DataFrame(standardized_data, columns=df.columns)

The standardized_df now contains the standardized data.
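If scikit-learn is not available, the same numbers can be reproduced with pandas alone. The sketch below assumes you want to match StandardScaler, which uses the population standard deviation. pandas defaults to the sample standard deviation, so ddof=0 has to be passed explicitly.

```python
import pandas as pd

df = pd.DataFrame({'Weight (g)': [300, 250, 800], 'Price ($)': [3, 2, 5]})

# ddof=0 selects the population standard deviation (divide by n);
# without it, pandas divides by n - 1 and the z-scores come out smaller.
standardized_df = (df - df.mean()) / df.std(ddof=0)

print(standardized_df.round(3))
```

A mismatch in ddof is a common reason why hand-computed z-scores differ slightly from a library's output.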

Advantages of Data Standardization

Standardizing data in machine learning has several advantages that contribute to improved model performance and robustness. Here are some key benefits of data standardization:

1. Improved Model Convergence

When dealing with features of varying scales, machine learning algorithms can take longer to converge or may even fail to converge. Standardizing the data ensures that features are on a common scale, making it easier for optimization algorithms to find the optimal model parameters.

2. Enhanced Model Interpretability

Standardized data simplifies the interpretation of model coefficients. In linear models, for example, the coefficients represent the change in the target variable for a one-unit change in the corresponding feature. With standardized features, these coefficients become more meaningful and directly comparable.

3. Robustness to Outliers

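Machine learning models can be sensitive to outliers in the data. Standardization puts extreme values in context: after scaling, an outlier is expressed as a number of standard deviations from the mean, which makes it easy to spot. Keep in mind, however, that the mean and standard deviation are themselves pulled by extreme values, so z-score scaling does not remove an outlier's influence. For heavily skewed data, scaling based on the median and interquartile range is often a better fit.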

4. Better Model Generalization

Standardized data often leads to models that generalize better to unseen data. When features are on a similar scale, the model can make predictions that are more consistent across different subsets of the data, resulting in improved generalization performance.

5. Compatibility with Regularization Techniques

Regularization techniques like L1 and L2 regularization assume that all features have similar scales. Standardization aligns features with these assumptions, allowing regularization to work effectively in controlling model complexity.

Conclusion

Data standardization is a critical step in preparing your data for machine learning. By scaling features appropriately, you ensure that your models are not biased toward any particular feature due to its scale. In this tutorial, we explored how to standardize data in Python using the z-score standardization technique.

The post How to Standardize Data in Python appeared first on Celery-Q.

]]>
https://celeryq.org/how-to-standardize-data/feed/ 0
PyShark: What It Is And How To Use  https://celeryq.org/pyshark/ https://celeryq.org/pyshark/#respond Thu, 21 Sep 2023 06:34:51 +0000 https://celeryq.org/?p=305  In the world of network programming and analysis, efficient packet parsing is essential. Python offers various libraries for this purpose, and PyShark stands out as a unique option. This article explores PyShark, a Python wrapper for tshark, Wireshark’s command-line utility. PyShark enables Python developers to parse packets using Wireshark dissectors, providing a powerful tool for […]

The post PyShark: What It Is And How To Use  appeared first on Celery-Q.

]]>
Read Time:5 Minute, 34 Second

 In the world of network programming and analysis, efficient packet parsing is essential. Python offers various libraries for this purpose, and PyShark stands out as a unique option. This article explores PyShark, a Python wrapper for tshark, Wireshark’s command-line utility. PyShark enables Python developers to parse packets using Wireshark dissectors, providing a powerful tool for network analysis.

What Is PyShark?

PyShark is a Python utility and library designed to parse packets using Wireshark dissectors. Unlike some other packet parsing modules, PyShark doesn’t directly parse packets; instead, it leverages tshark’s ability to export XMLs and uses them for parsing. This approach allows PyShark to use all installed Wireshark dissectors, making it a versatile choice for network analysis.

Installation

Before diving into PyShark, you need to install it. PyShark supports Python 3.7 and higher. You can install it using pip with the following command:

pip install pyshark

Alternatively, you can clone the PyShark repository from GitHub and install it manually:

git clone https://github.com/KimiNewt/pyshark.git
cd pyshark/src
python setup.py install

Usage

Reading from a Capture File

One common use case for PyShark is parsing packets from a capture file. Here’s how you can do it:

import pyshark
cap = pyshark.FileCapture('/path/to/your/capture_file.cap')
for packet in cap:
    print(packet)

Reading from a Live Interface

PyShark can also capture packets from a live network interface. Here’s an example:

capture = pyshark.LiveCapture(interface='eth0')
capture.sniff(timeout=50)
for packet in capture:
    print('Just arrived:', packet)

Filtering Packets

PyShark allows you to filter packets, either when reading from a capture file or a live interface. You can use BPF (Berkeley Packet Filter) filters or display filters to narrow down the packets you’re interested in. Here’s an example using a display filter:

filtered_cap = pyshark.FileCapture('/path/to/your/capture_file.cap', display_filter='http')
for packet in filtered_cap:
    print(packet)

Accessing Packet Data

Accessing packet data is straightforward with PyShark. Packets are divided into layers, and you can access their attributes. For instance:

packet['ip'].dst  # By protocol string
packet.ip.src     # By protocol attribute
packet[2].src     # By layer index

You can also use the dir(packet.my_layer) command to see available attributes for a layer.
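Not every packet carries every layer (an ARP frame, for example, has no IP layer), and accessing a missing layer as an attribute is expected to raise an AttributeError. The sketch below shows one defensive way to read the addresses. The helper name ip_endpoints is illustrative, and the SimpleNamespace objects are stand-ins that mimic only the attributes used here so the example runs without tshark; they are not real PyShark packets.

```python
from types import SimpleNamespace


def ip_endpoints(packet):
    """Return (source, destination) for packets with an IP layer, else None."""
    ip_layer = getattr(packet, 'ip', None)  # None when there is no IP layer
    if ip_layer is None:
        return None
    return ip_layer.src, ip_layer.dst


# Stand-in objects for demonstration only (assumption: real packets expose
# the same .ip.src and .ip.dst attributes shown in the article).
fake_ip_packet = SimpleNamespace(ip=SimpleNamespace(src='10.0.0.1', dst='10.0.0.2'))
fake_arp_packet = SimpleNamespace()

print(ip_endpoints(fake_ip_packet))   # ('10.0.0.1', '10.0.0.2')
print(ip_endpoints(fake_arp_packet))  # None
```

In a real capture loop, the helper would be called on each packet yielded by the capture object, skipping the ones that return None.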

Decrypting Packet Captures

PyShark supports automatic decryption of traces using standards like WEP, WPA-PWD, and WPA-PSK. For example:

capture = pyshark.FileCapture('/path/to/your/encrypted_capture.cap', decryption_key='your_decryption_key')
for packet in capture:
    print(packet)

Comparison Table 

Feature | PyShark | Scapy | dpkt
Parsing from Capture | Supported | Supported | Supported
Parsing from Live | Supported | Supported | Supported
Installation | Easy (pip install pyshark) | Easy (pip install scapy) | Easy (pip install dpkt)
Protocol Support | Wireshark Dissectors | Custom Parsing and Crafting | Custom Parsing
Filtering Capabilities | Display Filters and BPF Filters | Custom Filters and Conditions | Custom Filters and Conditions
Decryption Support | WEP, WPA-PWD, WPA-PSK | Not Built-in | Not Built-in
Layer-Based Parsing | Yes | Yes | Yes
Active Development | Yes | Yes | Limited
Compatibility | Cross-platform | Cross-platform | Cross-platform

The table above compares key features of PyShark, Scapy, and dpkt for packet parsing and analysis in Python. Each library has its strengths and use cases, so choosing the right one depends on your specific needs and preferences.

Key Advantages of PyShark

  • User-Friendly: PyShark offers a straightforward and user-friendly interface for parsing and analyzing network packets, making it suitable for both beginners and experienced developers;
  • Wireshark Integration: PyShark leverages the power of Wireshark dissectors, allowing you to access detailed information about various network protocols effortlessly;
  • Cross-Platform: It works seamlessly on both Windows and Linux operating systems, providing flexibility in your choice of development environment;
  • Decryption Support: PyShark supports automatic decryption of traces using encryption standards such as WEP, WPA-PWD, and WPA-PSK;
  • Active Development: The library is actively maintained, ensuring that it stays up-to-date with the latest developments in network protocols and technologies;
  • Layer-Based Parsing: You can access packet data at different layers, simplifying the process of extracting information from complex network traffic;
  • Filtering Capabilities: PyShark supports both display filters and BPF filters, enabling you to focus on specific packet subsets for in-depth analysis;
  • Versatile Usage: Whether you’re reading from a capture file, a live interface, or a remote interface, PyShark provides the necessary tools to handle various scenarios;
  • Protocol Support: It covers a wide range of protocols thanks to Wireshark’s extensive dissectors, making it suitable for diverse network analysis tasks;
  • Ease of Installation: Installing PyShark is straightforward, as it can be easily installed using the pip package manager;
  • Community and Documentation: PyShark benefits from an active community of users and has extensive documentation available to assist users in getting started and troubleshooting.

Conclusion 

In conclusion, PyShark stands as a versatile and powerful Python library for network packet parsing and analysis. Its integration with Wireshark’s dissectors grants users access to detailed network protocol information, simplifying the often complex task of network analysis. With cross-platform compatibility, PyShark can be seamlessly employed on both Windows and Linux systems, offering flexibility to developers.

One of its standout features is its support for automatic decryption of traces using encryption standards like WEP, WPA-PWD, and WPA-PSK. This capability enhances its utility for various network security and monitoring applications.

Moreover, PyShark provides a user-friendly interface that accommodates both novice and experienced users. Its layer-based parsing approach allows for precise data extraction from network traffic, and the filtering capabilities, including display and BPF filters, enable focused analysis.

FAQ

1. What is PyShark, and how does it differ from other packet parsing libraries?

PyShark is a Python wrapper for TShark, leveraging Wireshark’s powerful dissectors to parse network packets. Unlike other packet parsing libraries, PyShark doesn’t parse packets itself; instead, it utilizes TShark’s ability to export XML data for parsing. This approach allows PyShark to provide extensive protocol support without needing to reinvent the wheel.

2. Which Python versions are supported by PyShark?

PyShark supports Python 3.7 and above. There is also a legacy version called “pyshark-legacy” available for Python 2.

3. Can PyShark capture live network traffic, and how does it work?

Yes, PyShark can capture live network traffic from a specified interface. It does this by invoking TShark with the selected interface to capture packets in real time. Users can apply display filters to focus on specific traffic.

4. How does PyShark handle packet decryption?

PyShark offers automatic decryption support for network traces using encryption standards such as WEP, WPA-PWD, and WPA-PSK. By specifying the encryption type and key, PyShark can decrypt captured encrypted traffic for analysis.

The post PyShark: What It Is And How To Use  appeared first on Celery-Q.

]]>
https://celeryq.org/pyshark/feed/ 0
Unlocking Tables in HTML: Retrieving Tabular Data https://celeryq.org/extract-table-from-html/ https://celeryq.org/extract-table-from-html/#respond Thu, 21 Sep 2023 06:28:58 +0000 https://celeryq.org/?p=301 In the digital age, information is abundant, but often locked within the confines of HTML documents. As data-driven decision-making becomes increasingly vital, the ability to liberate structured information from web pages has become a valuable skill. In this tutorial, we embark on an exploration of how to harness the power of Python to extract tables […]

The post Unlocking Tables in HTML: Retrieving Tabular Data appeared first on Celery-Q.

]]>
Read Time:7 Minute, 23 Second

In the digital age, information is abundant but often locked inside HTML documents. As data-driven decision-making becomes increasingly vital, the ability to pull structured information out of web pages has become a valuable skill. In this tutorial, we explore how to use Python to extract tables from HTML files and webpages. Whether you’re a data scientist automating data collection, a web developer streamlining content extraction, or simply curious about web scraping, this guide gives you the knowledge and tools to extract tabular data efficiently.

The Significance of Extracting Tables from Digital Content

In the vast digital realm, tables represent a structured and organized format for data representation. Quite often, while browsing websites or working with HTML files, one may come across valuable tabular data that needs to be preserved or analyzed. Think of those times when you spotted an essential data set on a website, and wished to analyze it for a project or integrate it into a report.

However, the traditional methods of manually copying and pasting can become quite cumbersome. This process not only consumes an unnecessary amount of time but also poses challenges in retaining the original structure and formatting of the table.

Leveraging Python for Tabular Data Extraction

Fortunately, with the evolution of programming, automating such tasks has become possible. Python, one of the most versatile programming languages, offers a suite of libraries that can efficiently handle, extract, and process data from web pages and HTML files.

Some of the notable benefits of using Python for this purpose include:

  • Automation: Say goodbye to tedious manual extraction;
  • Accuracy: Ensures that data is captured precisely without missing values;
  • Preservation of Structure: Maintains the original layout and formatting of tables;
  • Versatility: Easily integrates with other tools and platforms for further analysis or visualization.

Essential Python Libraries for Data Extraction

For those keen on exploring this avenue, three primary Python libraries come to the rescue:

  • pandas: Renowned for data manipulation and analysis, pandas makes working with structured data a breeze;
  • html5lib: A pure-Python library for parsing HTML, it’s ideal for working with web content;
  • lxml: A library that provides a way to parse XML and HTML documents swiftly and efficiently.

Getting Started: Installing the Necessary Libraries

If the journey of data extraction intrigues you, begin by setting up the required environment. For those working on a Windows platform, the Command Prompt serves as the starting point.

Follow these steps to install the necessary libraries:

Launch the Command Prompt on your Windows system.

To install pandas, input the following command:

pip install pandas

For html5lib, use:

pip install html5lib

Lastly, to equip your environment with lxml, execute:

pip install lxml

Tips for a Smooth Installation Process:

  • Ensure you have the latest version of pip installed;
  • If facing issues, consider using a virtual environment for a clean setup;
  • Always refer to the official documentation of each library for any troubleshooting or advanced installation options.

Acquiring a Sample HTML File with Table Data

With all necessary tools in place, it’s time to take a closer look at an HTML file that contains table data. For this illustration, we will use an example file constructed in a past lesson. Named grades.html, it contains the following code:

<table border="1" class="dataframe">
  <!--... rest of the table structure ...-->
</table>

When this file is launched in a standard web browser, viewers will observe data neatly presented in a tabular layout. It’s advisable to ensure the file is stored within the identical directory as your Python script for seamless access.

It’s worth noting that this file exclusively comprises code for this single table. Nevertheless, the techniques and scripts showcased in the ensuing sections are versatile and can seamlessly handle documents brimming with multiple tables and diverse elements.

Extracting Table Content 

The succeeding segment elucidates the method to harness Python in extracting table contents from an HTML file.

To kick things off, import the required library and specify the location of the HTML file:

# Incorporate the necessary library
import pandas as pd

# Specify the HTML file's location
html_path = 'grades.html'

Next, the read_html() function from the pandas library retrieves the tables from the file and returns them as a list of DataFrames. For clarity, each table is printed with blank lines after it:

# Extract tables from the specified HTML file
tables = pd.read_html(html_path)

# Showcase each table, punctuated by blank lines for clarity
for table in tables:
    print(table, '\n\n')

When the above Python script is executed, the output should resemble:

   Unnamed: 0  student_id first_name last_name  grade
0           0        1221       John     Smith     86
1           1        1222      Erica     Write     92
2           2        1223       Will  Erickson     74
3           3        1224   Madeline      Berg     82
4           4        1225       Mike     Ellis     79

In scenarios where the extracted tables necessitate saving for future reference or processing, there’s the flexibility to export them into a CSV format. A point of importance is that if the source HTML is populated with several tables, the script will capture and display each one.
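The "Unnamed: 0" column in the output is the row index that was written into the HTML when the table was created. The sketch below shows one way to absorb it while reading, using the index_col parameter of read_html(). The inline markup is an assumption that imitates the elided grades table with two of its rows and a subset of its columns; it is wrapped in StringIO because recent pandas versions deprecate passing literal HTML strings directly. As in the rest of this tutorial, lxml or html5lib must be installed.

```python
from io import StringIO

import pandas as pd

# Illustrative markup (assumed structure), not the full grades.html file.
html = """
<table border="1" class="dataframe">
  <thead>
    <tr><th></th><th>student_id</th><th>first_name</th><th>grade</th></tr>
  </thead>
  <tbody>
    <tr><th>0</th><td>1221</td><td>John</td><td>86</td></tr>
    <tr><th>1</th><td>1222</td><td>Erica</td><td>92</td></tr>
  </tbody>
</table>
"""

# index_col=0 turns the first column into the row index instead of
# leaving it behind as a data column named "Unnamed: 0".
tables = pd.read_html(StringIO(html), index_col=0)
grades = tables[0]
print(grades)
```

For a file on disk, the same index_col argument can be passed alongside the file path.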

Webpage Table Extraction with Python

Extracting data tables from websites can be a vital step for data analysis. Python offers tools that make this process straightforward and efficient. Here’s a comprehensive guide to extracting tables from a webpage using Python:

1. Setting the Stage

To begin, ensure that the essential libraries are imported and specify the webpage’s URL from which the tables are to be extracted.

# Import the necessary libraries
import pandas as pd
from urllib.request import Request, urlopen

# Specify the webpage URL
url = 'https://pyshark.com/jaccard-similarity-and-jaccard-distance-in-python/'

In this scenario, the focus is on extracting tables from an article related to the Jaccard similarity and Jaccard distance concepts in Python. This article interestingly contains three distinct tables for extraction.

2. Fetching the Webpage Content

To retrieve the content from the specified webpage, utilize the urllib library:

# Construct a request object
request = Request(url)

# Incorporate headers to the request for compatibility
request.add_header('user-agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36')

# Open and read the webpage content
page = urlopen(request)
html_content = page.read()
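The fetching step can also be wrapped in small reusable functions. This is a sketch using only the standard library; the names build_request and fetch_html, the shortened user-agent string, and the 10-second timeout are illustrative assumptions.

```python
from urllib.request import Request, urlopen

DEFAULT_USER_AGENT = 'Mozilla/5.0'  # shortened, illustrative value


def build_request(url, user_agent=DEFAULT_USER_AGENT):
    """Create a request that identifies itself with a browser-like user agent."""
    return Request(url, headers={'User-Agent': user_agent})


def fetch_html(url, timeout=10):
    """Download a page and return its markup as text."""
    # The with-block closes the connection even if reading fails.
    with urlopen(build_request(url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or 'utf-8'
        return response.read().decode(charset, errors='replace')
```

Calling fetch_html(url) returns a string, which can be wrapped in io.StringIO before being handed to pandas. A timeout keeps the script from hanging indefinitely on an unresponsive server.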

3. Table Extraction from the HTML Content

With the help of the pandas library, it’s possible to read and process tables from the HTML content:

# Extract tables into a list of DataFrames
tables = pd.read_html(html_content)

# Display the extracted tables
for table in tables:
    print(table, '\n\n')

When the code runs successfully, the expected output will be the tables present in the webpage, which might look similar to:

    0      1       2     3     4       5      6
0  NaN  Apple  Tomato  Eggs  Milk  Coffee  Sugar

4. Conversion of HTML Tables to CSV Format

After extracting the tables, they can be further processed or stored for future use. One common storage format is CSV (Comma-Separated Values).

To save the tables in CSV format, modify the earlier loop:

# Save the extracted tables as CSV files
for i, table in enumerate(tables, start=1):
    file_name = f'table_{i}.csv'
    table.to_csv(file_name)

Upon execution, individual CSV files corresponding to each extracted table will be generated in the directory containing the Python script.
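By default, to_csv() also writes the DataFrame's row index as an unnamed first column, which later shows up as "Unnamed: 0" when the file is read back. If that column is not wanted, index=False leaves it out. The sketch below writes to an in-memory buffer so the result can be inspected directly; the sample DataFrame is illustrative.

```python
import io

import pandas as pd

table = pd.DataFrame({'student_id': [1221, 1222], 'grade': [86, 92]})

buffer = io.StringIO()
table.to_csv(buffer, index=False)  # omit the 0, 1, 2, ... row labels
print(buffer.getvalue())
```

Passing a file name instead of the buffer writes the same content to disk.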

In conclusion, Python offers an efficient and streamlined method to extract and store tables from webpages, making it a powerful tool for data analysts and enthusiasts alike.

Conclusion

In this informative piece, we delve into the art of table extraction from HTML documents and webpages, employing the dynamic trio of Python, pandas, and urllib. Should you find yourself in need of further clarification or wish to contribute your insights through constructive comments, please do not hesitate to engage with us below. Your questions and suggestions are most welcome.

The post Unlocking Tables in HTML: Retrieving Tabular Data appeared first on Celery-Q.

]]>
https://celeryq.org/extract-table-from-html/feed/ 0
Creating a Python-Based FTP Server from Scratch https://celeryq.org/ftp-server-python/ https://celeryq.org/ftp-server-python/#respond Thu, 21 Sep 2023 06:03:11 +0000 https://celeryq.org/?p=297 In today’s digital landscape, efficient data transfer and management are paramount, and the File Transfer Protocol (FTP) remains a cornerstone in achieving this goal. Whether you need to upload files to a web server, download data from a remote location, or automate routine file transfers, Python provides a robust set of tools to streamline your […]

The post Creating a Python-Based FTP Server from Scratch appeared first on Celery-Q.

]]>
Read Time:11 Minute, 50 Second

In today’s digital landscape, efficient data transfer and management are paramount, and the File Transfer Protocol (FTP) remains a cornerstone in achieving this goal. Whether you need to upload files to a web server, download data from a remote location, or automate routine file transfers, Python provides a robust set of tools to streamline your server interactions.

Welcome to our comprehensive tutorial, where we demystify working with FTP servers using Python. Throughout this guide, we cover the fundamentals of FTP, look at Python’s tools for FTP operations, and walk through practical examples that reflect real-world scenarios.

File Transfer Protocol is a robust protocol designed to facilitate the movement of files within a digital network. In essence, it serves as a bridge, enabling the transfer of files between computers, irrespective of their location.

Essential Characteristics:

  • Standardized Protocol: FTP follows a specific set of guidelines ensuring compatibility across diverse systems;
  • Bidirectional Transfer: It enables both uploads and downloads, making data exchange seamless;
  • Secure Variants: FTPS (FTP over TLS) adds encryption to FTP, while SFTP (SSH File Transfer Protocol), a separate protocol built on SSH, offers encrypted file transfers as an alternative.

FTP in the Corporate Landscape:

FTP isn’t just a relic of the internet’s early days. It’s a tool that businesses, particularly large-scale enterprises, rely on to exchange large files, ideally over an encrypted variant.

Why Companies Rely on It:

  • Efficient Large File Transfer: Bypasses email attachment limitations;
  • Enhanced Security: Plain FTP sends credentials and data unencrypted, but with the right configuration (for example FTPS, which runs FTP over TLS) sensitive data can be protected in transit;
  • Cross-Platform Compatibility: Allows file transfers across different operating systems.

File Transfer Protocol for the Tech-Savvy:

If you’re a programmer or a developer, there’s a good chance you’ve had to share outputs – from model predictions to test results. With FTP, these digital assets can be dispatched and accessed across various business segments without hassle.

Tips for Programmers:

  • File Naming Conventions: Adopt a clear and consistent naming pattern to make files easily recognizable;
  • Folder Structuring: Organize files in a hierarchical manner for easy navigation and retrieval;
  • Backup Regularly: Always maintain a local copy of the files before initiating transfers.

Getting Hands-On: Testing with DLP Test Server:

For those eager to experiment with FTP without any commitments, the DLP Test server offers a sandbox environment. This public test server lets users get a feel of the functionalities, but with a safety net – it automatically purges files after a short duration.

Advantages of Using DLP Test Server:

  • Cost-Effective: A free platform to test and understand FTP operations;
  • Data Safety: With automatic file deletion, there’s no lingering data, ensuring user privacy;
  • Ideal for Beginners: A risk-free environment where those new to FTP can learn the ropes.

Establishing a Connection to an FTP Server with Python

Transferring files and managing content on remote servers is made easy with the File Transfer Protocol (FTP). Python, known for its versatility, has a built-in library called ftplib that enables seamless interaction with FTP servers.

1. Installing Necessary Libraries

Before connecting to an FTP server, make sure the required library is available. For this task, only ftplib is needed, and because it is part of Python’s standard library there is nothing to install. Simply import it:

import ftplib

2. Gather Essential Credentials

To establish a connection to any FTP server, a set of credentials is required. These typically include:

  • Host: The server’s address;
  • Username: Authorized username to access the server;
  • Password: Password corresponding to the username for authentication.

For the purpose of demonstration, we’ll use the DLP Test server introduced above. Credentials for a server like this are typically listed on a documentation page or provided by the administrator.

3. Storing Credentials in Python

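For readability, you can define your credentials as constants at the top of your script. Hardcoding is acceptable here only because these are the published credentials of a public test server; real passwords should be kept out of source code: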

# Defining FTP server credentials
FTP_HOST = "ftp.dlptest.com"
FTP_USER = "dlpuser"
FTP_PASS = "rNrKYTX9g7z3RgJRmxWuGHbeu"
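For servers that are not public, one common alternative is to read the credentials from environment variables. The sketch below assumes the variable names FTP_HOST, FTP_USER, and FTP_PASS; the function name and the fallback host are illustrative choices.

```python
import os


def load_ftp_credentials(env=os.environ):
    """Read FTP settings from a mapping of environment variables."""
    return {
        'host': env.get('FTP_HOST', 'ftp.dlptest.com'),  # fallback is illustrative
        'user': env['FTP_USER'],    # raises KeyError if the variable is missing
        'passwd': env['FTP_PASS'],
    }
```

With the variables set in the shell, ftplib.FTP(**load_ftp_credentials()) opens the connection without any secret appearing in the script.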

4. Initiating Connection to the FTP Server

With the credentials in place, the next step is to leverage ftplib to establish a connection:

# Initiating a connection to the FTP server
ftp = ftplib.FTP(FTP_HOST, FTP_USER, FTP_PASS)

5. Verifying the Connection

Once the connection is set up, it’s advisable to validate it. A quick method is to request a welcome message from the connected FTP server. Most servers typically return a welcome message, though it’s not a universally followed norm.

# Retrieving and printing the server's welcome message
print(ftp.getwelcome())

For the aforementioned test server, this would ideally return:

'220 Welcome to the DLP Test FTP Server'

Remember, not all FTP servers might provide a welcome message. Therefore, the absence of a message isn’t always indicative of a failed connection. In our example, the DLP Test FTP Server is known to give a welcome message upon successful connection.
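Connections can fail for many reasons (wrong password, unreachable host, timeout), so it helps to handle errors and to close the connection reliably. The following sketch uses ftplib's context-manager support and its all_errors tuple. The function name fetch_welcome and the ftp_factory parameter are illustrative assumptions; the latter exists only so a substitute class can be supplied when experimenting without a network.

```python
import ftplib


def fetch_welcome(host, user, passwd, ftp_factory=ftplib.FTP, timeout=10):
    """Log in, return the server's welcome banner, and always close the connection."""
    try:
        # Leaving the with-block ends the session and closes the socket.
        with ftp_factory(host, timeout=timeout) as ftp:
            ftp.login(user=user, passwd=passwd)
            return ftp.getwelcome()
    except ftplib.all_errors as exc:
        # all_errors covers FTP protocol errors as well as OSError and EOFError.
        print(f'FTP connection failed: {exc}')
        return None
```

A call such as fetch_welcome(FTP_HOST, FTP_USER, FTP_PASS) returns the banner on success and None on failure, so the absence of a result is easy to check.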

Extracting a List of Files from an FTP Server Using Python

FTP (File Transfer Protocol) servers are often used to store files and directories that can be accessed and manipulated remotely. Python, with its extensive libraries and modules, offers straightforward ways to interact with FTP servers, making tasks such as listing files in a directory easy and efficient.

Understanding the Current Directory:

Before diving into listing files, it’s beneficial to understand the current directory you are in on the FTP server. This context can be ascertained using the following method:

# Retrieve the current directory path
current_directory = ftp.pwd()
print(current_directory)

Executing this should display the root directory represented as /, indicating the starting point within the FTP server.

Listing Files in the Directory:

Once you’re familiar with your location in the server’s structure, the next step is listing the files. This can be done in a couple of different ways:

Displaying a Detailed Directory Listing:

By invoking this method, the system presents a detailed view of files, complete with permissions, ownership, size, and modification date.

# Display a detailed directory listing
ftp.dir()

On execution, this would show an output similar to:

-rw-r--r--    1 1001     1001          958 Aug 15 17:30 file1.txt
-rw-r--r--    1 1001     1001          958 Aug 15 17:30 file2.txt
-rw-r--r--    1 1001     1001            0 Aug 15 17:31 file3.txt

Fetching a Simple List of File Names:

For those interested in just the file names without the accompanying details, Python provides another method.

# Fetch a list of files in the directory
file_list = ftp.nlst()
print(file_list)

When run, the script will generate a Python list showcasing the files:

['file1.txt', 'file2.txt', 'file3.txt']
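Because dir() prints its output rather than returning it, the listing has to be captured through a callback if the script needs to work with it. The sketch below collects the lines with retrlines() and extracts the size and name from each one. It assumes the Unix-style line format shown above; the listing format is server-dependent, so the parser is an illustration rather than a general solution, and both function names are hypothetical.

```python
def collect_listing(ftp):
    """Return the detailed directory listing as a list of text lines."""
    lines = []
    # retrlines calls the callback once per line instead of printing it.
    ftp.retrlines('LIST', lines.append)
    return lines


def parse_unix_listing(line):
    """Split one Unix-style listing line into permissions, size, and name."""
    # Limit the split to 8 so file names containing spaces stay in one piece.
    parts = line.split(None, 8)
    return {'permissions': parts[0], 'size': int(parts[4]), 'name': parts[8]}


sample = '-rw-r--r--    1 1001     1001          958 Aug 15 17:30 file1.txt'
print(parse_unix_listing(sample))
```

With an open connection, [parse_unix_listing(l) for l in collect_listing(ftp)] yields one dictionary per file.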

Creating a New Directory in Your FTP Server 

Before diving into the exciting world of uploading files to your server, let’s begin by establishing our own dedicated folder to organize our testing materials. This step will ensure a seamless and structured approach to managing your activities. Since this server serves the purpose of numerous developers testing their code, it’s crucial to maintain order and clarity.

Follow these steps to create a subdirectory called ‘pyshark’:

Initialize the FTP Connection: First, establish a connection to your server in Python. You can use the ftplib library, which provides the essential functionality.

# Import the ftplib library
from ftplib import FTP

# Establish an FTP connection (replace 'ftp.example.com' with your FTP server address)
ftp = FTP('ftp.example.com')

# Log in with your credentials (replace 'username' and 'password' accordingly)
ftp.login(user='username', passwd='password')

Create the New Directory: Once connected, create a new folder using the mkd() method, and name it ‘pyshark.’

# Create a new folder in the directory
ftp.mkd('pyshark')

Verification: 

To confirm the successful creation of the ‘pyshark’ folder, you have two options:

Use an FTP client like FileZilla to connect to your server and check for the ‘pyshark’ directory.

Alternatively, print a list of file names in the directory to ensure ‘pyshark’ is listed.

# Print a list of files in the directory
print(ftp.nlst())
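On a shared server, or when a script is run a second time, the directory may already exist, in which case mkd() typically fails with a permanent error (ftplib.error_perm). A minimal sketch of a guard follows; the helper name ensure_directory is illustrative, and it assumes nlst() returns bare names for the current directory, which may differ between servers.

```python
def ensure_directory(ftp, name):
    """Create `name` in the current remote directory unless it is already listed.

    Returns True if the directory was created, False if it was already there.
    """
    if name in ftp.nlst():
        return False  # already present, nothing to do
    ftp.mkd(name)
    return True
```

Calling ensure_directory(ftp, 'pyshark') makes the step safe to repeat.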

Setting the New Directory as the Current Directory on the Server

Now that we have ‘pyshark’ set up, let’s make it the current working directory on the server for our operations.

Follow these steps:

Change the Current Directory: Use the cwd() method to set ‘pyshark’ as the current directory.

# Set the new folder as the current directory
ftp.cwd('pyshark')

Validation: Confirm the directory change by retrieving the name of the current directory and printing it.

# Get current directory path
print(ftp.pwd())

You should see the output ‘/pyshark,’ indicating that ‘pyshark’ is now your current working directory on the server.

Preparing a Sample File for Upload

Create the Sample File: Open a text editor or Python script and create a new file called ‘file1.txt.’ Add the text “This is a sample file” to it and save it.
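If you prefer to create the sample file from Python rather than a text editor, a short sketch using pathlib from the standard library could look like this. It writes to the current working directory, which is assumed to be where the upload script runs.

```python
from pathlib import Path

# Write the sample text to file1.txt in the current working directory.
sample_path = Path('file1.txt')
sample_path.write_text('This is a sample file', encoding='utf-8')

# Read it back to confirm the content before uploading.
print(sample_path.read_text(encoding='utf-8'))
```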

Uploading Files to Your Server

Now that we have the ‘pyshark’ directory ready and a sample file prepared, let’s explore how to upload this file to the FTP server.

Follow these steps to upload the sample file:

# Upload file to FTP server
with open('file1.txt', 'rb') as f:
    ftp.storbinary('STOR ' + 'uploaded_file.txt', f)

Verification: To confirm the successful upload, generate a list of files in the current directory using nlst().

# Produce a list of files in the directory
print(ftp.nlst())

You should observe the output [‘uploaded_file.txt’], indicating that your file ‘file1.txt’ has been successfully uploaded to the server under the name ‘uploaded_file.txt.’

Retrieving Files from an FTP Server

Understanding how to efficiently fetch files from a File Transfer Protocol server is crucial when dealing with remote file operations. These servers have been a long-standing method to store, retrieve, and manage files over a network. This section delves deep into the process of downloading a specific file from such a server.

For the sake of context, consider a hypothetical scenario where there’s an existing directory named “pyshark”. This will be the directory under focus for this tutorial.

The Core Method: .retrbinary()

The FTP class in Python comes with a method named .retrbinary(). As the name suggests, this method retrieves a file from the server in binary mode, using the FTP command RETR under the hood. Transferring in binary mode ensures that files such as images or executables are downloaded byte for byte, without corruption.

Let’s assume a scenario wherein a file was previously uploaded to this server. The goal now is to download this very file back to the local system.

Step-by-Step Implementation:

Preparation of Destination File:

Before initiating the download process, there should be a file on the local system to store the incoming data. To prepare for this, open a local file in binary write mode (‘wb’). In this example, this file is named “downloaded_file.txt”.

Downloading the File:

Use the .retrbinary() method to initiate the download process. The target file on the server is “uploaded_file.txt”.

# Retrieving a file from the FTP server
with open('downloaded_file.txt', 'wb') as f:
    ftp.retrbinary('RETR ' + 'uploaded_file.txt', f.write)

Post-Download Check:

Once the code is executed, it’s essential to verify the download. Check the local directory from which the script was run (not the “pyshark” directory on the server): “downloaded_file.txt” should be present there. This file will contain the data fetched from the FTP server.
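One way to make the post-download check concrete is to compare the downloaded copy with the file that was originally uploaded. This is a minimal sketch using filecmp from the standard library; the helper name files_match is hypothetical, and it assumes both files are available on the local machine.

```python
import filecmp


def files_match(original, downloaded):
    """Return True when both local files contain identical bytes."""
    # shallow=False forces a comparison of the file contents instead of
    # relying on file size and modification time alone.
    return filecmp.cmp(original, downloaded, shallow=False)
```

For the files used in this tutorial, the call would be files_match('file1.txt', 'downloaded_file.txt').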

Renaming Files: A Comprehensive Guide

Current Working Directory: Before we proceed, ensure that you are working in the correct directory on your FTP server. In this example, we’ll be working within the “pyshark” folder.

Using the .rename() Method: Renaming a file is straightforward. You can accomplish this task using the .rename() method of the FTP class. This method takes two parameters: the original file name and the new file name.

Here’s an example of how to rename “uploaded_file.txt” to “renamed_file.txt”:

# Rename file in FTP server
ftp.rename('uploaded_file.txt', 'renamed_file.txt')

Verifying the Rename: After renaming the file, it’s always a good practice to confirm that the operation was successful. You can do this by listing the files in the directory. The .nlst() method helps us achieve this.

# Produce a list of files in the directory
print(ftp.nlst())

If the renaming was successful, you should see the new name in the list:

['renamed_file.txt']

Tips and Insights:

  • Always double-check the original and new file names to avoid errors during the renaming process;
  • Renaming files can be particularly useful for maintaining an organized and structured file system;
  • Make sure you have the necessary permissions to rename files, as this might be restricted in certain environments.

Deleting Files: Step by Step

Current Working Directory: As before, ensure that you are in the “pyshark” folder.

Using the .delete() Method: To delete a file from the server, you can employ the .delete() method of the FTP class. This method requires just one parameter – the file name you want to remove.

Because the file was renamed in the previous section, it now exists on the server as “renamed_file.txt”. Here’s an example of how to delete it:

# Delete file from FTP server
ftp.delete('renamed_file.txt')

Verifying the Deletion: After deleting the file, it’s essential to confirm that it was removed successfully. You can verify this by listing the files in the directory using the .nlst() method.

# Produce a list of files in the directory
print(ftp.nlst())

If the deletion was successful, you should see an empty list.

Tips and Recommendations:

  • Always be cautious when deleting files, as this action is irreversible;
  • Regularly review and clean up your server to free up storage space and maintain an organized file structure;
  • It’s a good practice to perform a final check to ensure you’re deleting the correct file before executing the deletion command.
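The final-check advice above can be sketched as a small helper that only issues the delete command when the file actually appears in the listing. The name delete_if_present is hypothetical, and the sketch assumes ftp is a logged-in ftplib.FTP object whose current directory contains the file.

```python
def delete_if_present(ftp, name):
    """Delete `name` from the current server directory if it is listed.

    Returns True when a file was deleted and False when it was not found.
    """
    # Some servers return paths rather than bare names from nlst(),
    # so compare on the last path component.
    listed = {entry.rstrip('/').split('/')[-1] for entry in ftp.nlst()}
    if name not in listed:
        return False
    ftp.delete(name)
    return True
```

Returning a boolean instead of raising keeps the calling script simple when it is re-run after the file has already been removed.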

By following these steps and best practices, you’ll be adept at renaming and deleting files, ensuring smooth file management for your projects and data.

Conclusion

In this article, we explored how to use Python’s ftplib library to interact with an FTP server. We covered listing files, creating and switching directories, uploading and downloading files, and renaming and deleting files on the server.

The post Creating a Python-Based FTP Server from Scratch appeared first on Celery-Q.

]]>
https://celeryq.org/ftp-server-python/feed/ 0
Performing Matrix Subtraction in Python https://celeryq.org/matrix-subtraction-python/ https://celeryq.org/matrix-subtraction-python/#respond Wed, 20 Sep 2023 14:55:16 +0000 https://celeryq.org/?p=293 Matrices are fundamental mathematical constructs with widespread applications in fields ranging from computer science and physics to engineering and data analysis. They serve as powerful tools for organizing and manipulating data, making them indispensable in various computational tasks. In this article, we will delve into the world of matrix operations, focusing specifically on matrix addition […]

The post Performing Matrix Subtraction in Python appeared first on Celery-Q.

]]>
Read Time:5 Minute, 39 Second

Matrices are fundamental mathematical constructs with widespread applications in fields ranging from computer science and physics to engineering and data analysis. They serve as powerful tools for organizing and manipulating data, making them indispensable in various computational tasks. In this article, we will focus on matrix subtraction, a close relative of matrix addition and one of the basic operations of linear algebra and numerical computing.

We will walk through the essential steps and the intuition behind matrix subtraction, with a particular emphasis on its practical implementation in Python. Whether you’re a seasoned programmer or just starting out in mathematics and coding, this guide will equip you with the knowledge and skills needed to perform this operation confidently.

Matrix Subtraction: A Comprehensive Guide

Matrix subtraction might seem like a daunting task for those new to the concept, but with the right intuition and understanding, it becomes a straightforward process. This guide delves deep into the mechanics of it, providing simple examples and also demonstrating how Python, a popular programming language, simplifies the task even further.

A Brief Overview

  • Core Concept: At its essence, matrix subtraction is quite similar to matrix addition. It involves taking two matrices of the same dimension and subtracting the elements of one matrix from the corresponding elements of the other matrix;
  • Simple Examples: The examples provided in this guide are designed to be straightforward, ensuring that readers can grasp the fundamental principles without the need for complex calculations or tools.

Advancing Beyond the Basics

Even though this guide starts with basic examples, the techniques discussed here can be effectively applied to more intricate matrix subtraction problems. By building a strong foundation, readers can confidently tackle even the most challenging matrix operations.

Python and Matrix Operations: A Perfect Duo

Python, a versatile programming language, has several libraries that make matrix operations, including subtraction, a breeze. One of the most prominent libraries for this purpose is numpy.

Benefits of using NumPy:

  • Efficiency: Matrix operations can be performed with just a few lines of code;
  • Flexibility: Python’s libraries can handle matrices of varying sizes and complexities;
  • Visualization: Combined with a plotting library, Python can also be used to visualize matrices, aiding in better understanding and interpretation.

Setting Up Your Python Environment 

To ensure a smooth experience, it’s essential to have the numpy library installed. This section guides readers through the installation process.

Steps to Install numpy:

  • Open the Command Prompt (Windows) or a terminal (macOS/Linux);
  • Type in the following command and press enter:

pip install numpy

Wait for the installation process to complete.

Tips for a Successful Installation:

  • Ensure your Python and pip versions are up-to-date;
  • If you encounter any errors, consult Python’s official documentation or forums for troubleshooting.

Understanding Matrix Subtraction

Matrix subtraction is a straightforward operation applied to matrices. However, for this subtraction to occur, there’s a crucial criterion to be met: both matrices must be of identical dimensions. This means that if one matrix is 2×2, the other should also be 2×2; for a 3×3 matrix, its counterpart should also be 3×3, and so forth.

A Practical Illustration

Imagine a scenario where two farmers each possess specific quantities of apples and grapes in their storage facilities. Visualizing this situation can be achieved by using a table:

Items     Farmer 1    Farmer 2
Apples    3           2
Grapes    7           5

This table’s representation in the realm of matrices would be:

$$A = \begin{bmatrix} 3 & 2 \\ 7 & 5 \end{bmatrix}$$

Suppose both farmers decide to sell some of their stock at a market. The number of fruits they sold can be similarly tabulated:

Items     Farmer 1    Farmer 2
Apples    2           1
Grapes    1           3

This data in matrix form becomes:

$$B = \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}$$

After the day’s sales, the farmers decide to take stock of their remaining inventory. The simplest way to find out the residual stock is by subtracting the quantities of the sold fruits from the initial stock.

Resulting inventory:

Items     Farmer 1    Farmer 2
Apples    1           1
Grapes    6           2

Performing this subtraction with matrices, we get:

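$$C = A - B = \begin{bmatrix} 3 & 2 \\ 7 & 5 \end{bmatrix} - \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 6 & 2 \end{bmatrix}$$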

This result C matches perfectly with the tabulated inventory data.

Generalization to Larger Matrices

Matrix subtraction isn’t confined to 2×2 matrices. For any m×n matrix, the subtraction operates in a cell-by-cell manner. Represented mathematically:

If $A = (a_{ij})$ and $B = (b_{ij})$, where $i = 1, \dots, m$ and $j = 1, \dots, n$, then the result of the subtraction, $C = A - B$, has elements $c_{ij}$ such that $c_{ij} = a_{ij} - b_{ij}$ for every $i$ and $j$.

In simpler terms, each element in matrix B is subtracted from the corresponding element in matrix A to yield the resultant matrix C.
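To make the cell-by-cell rule tangible before bringing in any library, here is a minimal pure-Python sketch that stores each matrix as a list of rows. The function name subtract_matrices is hypothetical; the values are the farmers’ matrices from the illustration above.

```python
def subtract_matrices(a, b):
    """Return C = A - B for two matrices given as lists of rows."""
    same_rows = len(a) == len(b)
    same_cols = same_rows and all(len(ra) == len(rb) for ra, rb in zip(a, b))
    if not same_cols:
        raise ValueError('matrices must have identical dimensions')
    # c_ij = a_ij - b_ij for every row i and column j
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


A = [[3, 2], [7, 5]]
B = [[2, 1], [1, 3]]
print(subtract_matrices(A, B))  # [[1, 1], [6, 2]]
```

The dimension check comes first because the subtraction is only defined when every element of one matrix has a counterpart in the other.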

Matrix Subtraction Using Python and Numpy

Matrix operations are a core concept in linear algebra and have extensive applications in various domains, ranging from data analysis to computer graphics. Python, being a versatile programming language, offers an array of tools to manipulate and operate on matrices. One of the most powerful libraries for this purpose is NumPy.

Integrating NumPy for Matrix Operations

To get started with matrix operations in Python, it’s crucial to have the NumPy library, which provides numerous functionalities for matrix arithmetic. Begin by integrating this library into the script:

import numpy as np

NumPy not only makes matrix operations intuitive but also optimizes them for better performance.

Constructing Matrices in Python

In Python, matrices can be represented as two-dimensional NumPy arrays. When it comes to matrix subtraction, it’s paramount that the matrices have identical dimensions. This ensures that each element in one matrix has a corresponding element in the other matrix to pair with during the subtraction process.

For this demonstration, consider two 2×2 matrices:

A = np.array([[3, 2],
              [7, 5]])

B = np.array([[2, 1],
              [1, 3]])

These matrices consist of rows and columns of numbers that can be manipulated using NumPy’s functions.

Performing Matrix Subtraction

Matrix subtraction in NumPy is straightforward, thanks to the subtract() function. This function processes the matrices element-wise, subtracting corresponding elements from each matrix.

Execute the subtraction using the following code:

C = np.subtract(A, B)

print(C)

Upon executing the above, the output will be:

[[1 1]
 [6 2]]

This result matches what would be obtained by manual calculations. With NumPy, the complexities of matrix arithmetic get abstracted, enabling users to focus on higher-level operations and applications.
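Two practical details are worth sketching here. First, the - operator on NumPy arrays performs the same element-wise subtraction as np.subtract(). Second, NumPy’s broadcasting rules can silently accept arrays of different shapes, so a strict check may be useful when true matrix subtraction is intended. The helper strict_subtract below is a hypothetical name, not a NumPy function.

```python
import numpy as np

A = np.array([[3, 2],
              [7, 5]])
B = np.array([[2, 1],
              [1, 3]])

# The - operator performs the same element-wise subtraction as np.subtract().
C = A - B
print(np.array_equal(C, np.subtract(A, B)))  # True


def strict_subtract(left, right):
    """Subtract only when both arrays have exactly the same shape."""
    if left.shape != right.shape:
        raise ValueError(f'shape mismatch: {left.shape} vs {right.shape}')
    return left - right


# Broadcasting accepts a (2, 2) array minus a (2,) array without complaint,
# whereas strict_subtract(A, row) would raise a ValueError.
row = np.array([1, 1])
print((A - row).shape)  # (2, 2)
```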

Conclusion

In this article, we looked at matrix subtraction: the intuition behind it, the steps involved, and a worked example implemented in Python with NumPy.

If you have questions or suggestions for improving this article, feel free to share them in the comments section below.

]]>
https://celeryq.org/matrix-subtraction-python/feed/ 0
Python: Transforming JSON into DataFrames Made Easy https://celeryq.org/json-to-dataframe-python/ https://celeryq.org/json-to-dataframe-python/#respond Wed, 20 Sep 2023 14:50:36 +0000 https://celeryq.org/?p=289 In today’s data-driven world, handling data in various formats is an essential skill for every data professional and Python enthusiast. JSON (JavaScript Object Notation) has become a ubiquitous data interchange format due to its simplicity and flexibility. On the other hand, Pandas, the popular data manipulation library in Python, provides powerful tools for data analysis […]

The post Python: Transforming JSON into DataFrames Made Easy appeared first on Celery-Q.

]]>
Read Time:6 Minute, 41 Second

In today’s data-driven world, handling data in various formats is an essential skill for every data professional and Python enthusiast. JSON (JavaScript Object Notation) has become a ubiquitous data interchange format due to its simplicity and flexibility. On the other hand, Pandas, the popular data manipulation library in Python, provides powerful tools for data analysis and manipulation. When you combine the versatility of JSON with the data-wrangling capabilities of Pandas, you open up a world of possibilities for working with structured data.

In this article, we will walk through the process of converting JSON into a Pandas DataFrame step by step. Whether you are extracting data from web APIs, reading files, or working with data received in JSON format, understanding how to transform it into a Pandas DataFrame is a valuable skill. We will cover reading a simple JSON file as well as handling nested structures.

So, if you’re ready to unlock the potential of your data and harness the analytical power of Pandas, let’s dive into the details of the conversion.

Unlocking the Power of Data

In data science, every meaningful project begins with a crucial step: accessing and interpreting data accurately. This step sets the stage for exploration, analysis, and the generation of insights. In this guide, we’re going to show how to transform JSON data, a versatile and widely used data format, into a Pandas DataFrame within the Python ecosystem, so that your later data manipulation work becomes much easier.

1. The Significance of JSON in Data Science

Before delving into the conversion process, let’s explore why JSON has become the go-to choice for data scientists in various projects. It offers several compelling advantages:

  • Data Diversity: JSON accommodates diverse data types, including strings, numbers, objects, arrays, and more. This flexibility makes it ideal for handling complex datasets;
  • Interoperability: JSON is widely supported by programming languages, making it easy to exchange data between systems regardless of their technological stack;
  • Human-Readable: Its structure is human-readable, aiding in the debugging process and fostering collaboration among team members;
  • API Friendliness: JSON is the preferred format for many APIs, enabling seamless data retrieval and integration from a plethora of sources.
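To see how JSON’s value types map onto Python objects, here is a small sketch using the json module from the standard library. The record itself is made-up illustrative data, not part of the sample files used later.

```python
import json

# A small JSON document mixing several value types (illustrative data only).
raw = (
    '{"name": "Jake", "age": 30, "score": 4.5, "active": true, '
    '"tags": ["a", "b"], "address": {"city": "Leeds"}, "nickname": null}'
)

record = json.loads(raw)

# Print each key with the Python type its JSON value was decoded into.
for key, value in record.items():
    print(key, type(value).__name__)
```

JSON objects become dictionaries and JSON arrays become lists, which is why nested JSON needs extra handling before it fits into a flat table.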

2. The Python-Pandas Synergy

Python, renowned for its simplicity and versatility, is the preferred language for data manipulation. Within the Python ecosystem, Pandas shines as the go-to library for data analysis and manipulation. Before we dive into the conversion process, let’s understand why Python and Pandas make an unbeatable duo:

  • Python’s Elegance: its clean syntax and readability are a boon for data scientists. It simplifies code development and debugging, making it the ideal choice for data-centric tasks;
  • Pandas’ Power: Pandas is a game-changer when it comes to data manipulation. It introduces data structures like DataFrames, which are akin to tables in databases, making it seamless to work with structured data.

3. Preparing Your Python Environment

Before we start converting JSON to a Pandas DataFrame, you need to ensure your Python environment is equipped with the necessary tools. Here are the steps:

a. Installing Pandas

If Pandas isn’t already installed, follow these steps to get it up and running in your environment:

  • Open your Command Prompt (Windows) or Terminal (macOS/Linux);
  • Run the following command:
pip install pandas

b. Verifying Version

It’s crucial to have Pandas version 1.0.3 or higher for this conversion process. To check your version, follow these steps:

In your Python environment, import Pandas:

import pandas as pd

Then, print the Pandas version:

print(pd.__version__)

c. Updating Pandas (if necessary)

If your version falls short of 1.0.3, don’t worry; it’s easy to upgrade. Use the following command in your Command Prompt or Terminal:

pip install --upgrade pandas

Crafting Sample JSON Files for Data Analysis

To illustrate how to work with JSON structures and convert them into DataFrames, this guide walks through the creation of two distinct JSON files. Once created, these files can serve as a foundation for further exploration in data analysis, processing, or migration.

1. Basic Structure

The first file we’ll delve into is characterized by its simplicity. This structure contains fundamental user information without nested elements. Here’s how this layout appears:

[
    {
        "userId": 1,
        "firstName": "Jake",
        "lastName": "Taylor",
        "phoneNumber": "123456",
        "emailAddress": "john.smith@example.com"
    },
    {
        "userId": 2,
        "firstName": "Brandon",
        "lastName": "Glover",
        "phoneNumber": "123456",
        "emailAddress": "brandon.glover@example.com"
    }
]

To keep everything organized and accessible, it’s recommended to save this file under the name sample.json, ideally situated in the same directory where the associated Python scripts reside.
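Instead of typing the file by hand, sample.json can also be generated from Python. The following sketch uses json.dump from the standard library and reproduces the records shown above; it writes to the current working directory.

```python
import json

# The same two user records as in the listing above.
users = [
    {
        "userId": 1,
        "firstName": "Jake",
        "lastName": "Taylor",
        "phoneNumber": "123456",
        "emailAddress": "john.smith@example.com",
    },
    {
        "userId": 2,
        "firstName": "Brandon",
        "lastName": "Glover",
        "phoneNumber": "123456",
        "emailAddress": "brandon.glover@example.com",
    },
]

# indent=4 keeps the file human-readable.
with open('sample.json', 'w', encoding='utf-8') as f:
    json.dump(users, f, indent=4)
```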

2. Advanced JSON Structure with Nested Elements

For users seeking a more complex data structure, the second example incorporates nested elements, providing a richer context. In addition to the basic user details, this file also presents an embedded structure to store the courses associated with each user. Here’s an illustrative example of this intricate structure:

[
    {
        "userId": 1,
        "firstName": "Jake",
        "lastName": "Taylor",
        "phoneNumber": "123456",
        "emailAddress": "john.smith@example.com",
        "courses": {
            "course1": "mathematics",
            "course2": "physics",
            "course3": "engineering"
        }
    },
    {
        "userId": 2,
        "firstName": "Brandon",
        "lastName": "Glover",
        "phoneNumber": "123456",
        "emailAddress": "brandon.glover@example.com",
        "courses": {
            "course1": "english",
            "course2": "french",
            "course3": "sociology"
        }
    }
]

For ease of access and organization, it’s advised to save this file as nested_sample.json. Again, storing it in the same directory as the relevant Python scripts ensures seamless integration during subsequent operations.

In sum, these two sample JSON files, one basic and one nested, offer a practical starting point for anyone looking to understand and use JSON for data-centric tasks. Whether used for educational purposes or as a first step in a broader analytical project, these samples give a concrete picture of what JSON data can look like.

Turning Basic JSON into a DataFrame with Pandas in Python

Thankfully, for those acquainted with Python and Pandas, integrating JSON data into their workflow is straightforward. Pandas, a powerful data manipulation and analysis library, includes a function called read_json(). This function reads a JSON file and transforms it into a DataFrame, which is essentially a table or two-dimensional array-like structure.

Here’s a demonstration of this process:

import pandas as pd

# Reading the JSON file into a DataFrame
df = pd.read_json("sample.json")

# Displaying the DataFrame
print(df)

When executed, this code fetches the content from sample.json and projects it as a DataFrame, making it easier to manipulate and analyze.
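The same function also accepts file-like objects, which is handy when the JSON arrives as a string (for example from an API response) rather than a file. The sketch below wraps a string in StringIO and inspects the result. Passing dtype for phoneNumber is an optional choice made here to keep that column as text, since read_json may otherwise infer a numeric type for digit-only strings.

```python
from io import StringIO

import pandas as pd

# The content of sample.json as an in-memory string.
raw = """[
    {"userId": 1, "firstName": "Jake", "lastName": "Taylor",
     "phoneNumber": "123456", "emailAddress": "john.smith@example.com"},
    {"userId": 2, "firstName": "Brandon", "lastName": "Glover",
     "phoneNumber": "123456", "emailAddress": "brandon.glover@example.com"}
]"""

# Keep phoneNumber as a string column instead of letting it be inferred.
df = pd.read_json(StringIO(raw), dtype={'phoneNumber': str})

print(df.shape)          # (2, 5): two users, five fields each
print(list(df.columns))
```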

Translating Nested JSON into a DataFrame

Sometimes, JSON data structures can be a bit more complex, containing nested elements and deeper hierarchies. For instance, when contrasting nested_sample.json with sample.json, one can discern a new field titled ‘courses’: a nested object (a dictionary in Python terms) holding several key-value pairs.

For such structures, a regular read_json() falls short, because it would leave each user’s ‘courses’ dictionary packed into a single column. Instead, there’s the json_normalize() function, crafted precisely for these scenarios. It takes semi-structured JSON data and flattens it into a table-like structure, giving each nested key its own column.

Here’s a glimpse of how to employ this function:

import pandas as pd
import json

# Opening and reading the nested JSON file
with open('nested_sample.json', 'r') as f:
    nested_data = json.load(f)

# Converting the nested JSON to a DataFrame
df_nested = pd.json_normalize(nested_data)

# Showcasing the DataFrame
print(df_nested)

By using the above snippet, the structured content within nested_sample.json is rendered into a Pandas DataFrame. This transformation allows users to effectively navigate and manipulate data, even when originally presented in a deeply nested format.
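To see what the flattening does to the column names, here is a small self-contained sketch. The records are a trimmed version of nested_sample.json (only some fields are kept, for brevity), and the sep argument is an optional choice made here to join nested keys with an underscore instead of the default dot.

```python
import pandas as pd

# Trimmed records based on nested_sample.json (some fields omitted).
nested_data = [
    {
        "userId": 1,
        "firstName": "Jake",
        "courses": {"course1": "mathematics", "course2": "physics",
                    "course3": "engineering"},
    },
    {
        "userId": 2,
        "firstName": "Brandon",
        "courses": {"course1": "english", "course2": "french",
                    "course3": "sociology"},
    },
]

# Each key inside "courses" becomes its own column, prefixed with "courses_".
df_nested = pd.json_normalize(nested_data, sep='_')

print(list(df_nested.columns))
print(df_nested['courses_course2'].tolist())
```

With the default separator, the same columns would be named courses.course1, courses.course2, and courses.course3.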

Conclusion

In this article, we covered how to transform JSON data into a Pandas DataFrame in Python: read_json() handles flat JSON files directly, while the ‘json’ module combined with json_normalize() takes care of nested structures.

Should you find yourself curious or inclined to offer valuable insights or refinements, we warmly invite you to share your thoughts in the comments section below.

]]>
https://celeryq.org/json-to-dataframe-python/feed/ 0