In Python 2, print is a statement, while in Python 3, it is a function.
Python 2 Example:
# Python 2
print "Hello, World!"
Python 3 Example:
# Python 3
print("Hello, World!")
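Because print is an ordinary function in Python 3, it accepts keyword arguments such as sep, end, and file. The following is a minimal sketch of that; the buffer and captured names are illustrative and not part of the original examples. In Python 2.6 and later, the same call form becomes available with from __future__ import print_function.

```python
# Python 3
import io

# In-memory text stream, used here in place of the console (illustrative name)
buffer = io.StringIO()

# sep sets the separator between arguments; end replaces the trailing newline
print("Hello", "World", sep=", ", end="!\n", file=buffer)
captured = buffer.getvalue()

print(captured, end="")  # Output: Hello, World!
```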
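In Python 2, dividing two integers with / performs floor division, so the result is also an integer: 5 / 2 gives 2. The result is rounded toward negative infinity rather than truncated, so -5 / 2 gives -3. In Python 3, the / operator performs true division and returns a float, and the // operator is available for floor division.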
Python 2 Example:
# Python 2
result = 5 / 2
print result # Output: 2
Python 3 Example:
# Python 3
result = 5 / 2
print(result) # Output: 2.5
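When porting code that relied on the old integer behavior, the // operator is the usual replacement. The sketch below compares the two operators in Python 3, including a negative operand; the variable names are illustrative.

```python
# Python 3
true_quotient = 5 / 2      # true division always returns a float
floor_quotient = 5 // 2    # floor division, matching Python 2's int / int
negative_floor = -5 // 2   # floors toward negative infinity, not toward zero
truncated = int(-5 / 2)    # int() truncates toward zero instead

print(true_quotient)   # Output: 2.5
print(floor_quotient)  # Output: 2
print(negative_floor)  # Output: -3
print(truncated)       # Output: -2
```

Python 2 code can opt in to the Python 3 behavior with from __future__ import division, which is a common first step in a migration.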
Python 2 has two string types: byte strings (str), which are often treated as ASCII text, and Unicode strings (unicode). In Python 3, the str type represents Unicode text by default, and binary data uses the separate bytes type.
Python 2 Example:
# Python 2
ascii_string = 'Hello'
unicode_string = u'你好'
Python 3 Example:
# Python 3
unicode_string = '你好'
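In Python 3, converting between text and binary data is always explicit. A minimal sketch, assuming UTF-8 as the encoding (the original examples do not name one):

```python
# Python 3
text = '你好'                       # str: a sequence of Unicode code points
encoded = text.encode('utf-8')     # bytes: the UTF-8 encoding of that text
decoded = encoded.decode('utf-8')  # back to str

print(type(text).__name__, len(text))        # Output: str 2
print(type(encoded).__name__, len(encoded))  # Output: bytes 6
print(decoded == text)                       # Output: True
```

The length difference shows why the distinction matters: the str holds two characters, while its UTF-8 encoding occupies six bytes.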
The syntax for exception handling differs slightly. Python 2 accepts the older comma form (except ZeroDivisionError, e) and, from version 2.6, also the as keyword. In Python 3, only the as form is valid.
Python 2 Example:
# Python 2
try:
    result = 1 / 0
except ZeroDivisionError, e:
    print "Error:", e
Python 3 Example:
# Python 3
try:
    result = 1 / 0
except ZeroDivisionError as e:
    print("Error:", e)
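Since the as form is accepted by Python 2.6 and later as well as Python 3, it is the safer choice for code that may need to run on both. The sketch below wraps the pattern in a small function; safe_divide is a hypothetical helper introduced only for illustration.

```python
# Python 3 (the except ... as form also parses on Python 2.6+)
def safe_divide(a, b):
    """Return a / b, or an error message string if b is zero."""
    try:
        return a / b
    except ZeroDivisionError as e:
        # e holds the exception instance raised by the division
        return "Error: {}".format(e)

print(safe_divide(6, 3))  # Output: 2.0
print(safe_divide(1, 0))
```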
Many modern data science libraries, such as pandas, NumPy, and scikit-learn, have fully embraced Python 3 and no longer support Python 2 in their current releases. For example, pandas 0.24.x was the last release series to support Python 2.
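A script that depends on such libraries can check the interpreter version up front and fail with a clear message. This is a minimal sketch using the standard sys module; is_python3 is a hypothetical helper name, not something the libraries provide.

```python
import sys

def is_python3(version_info=sys.version_info):
    """Return True when the major version is 3 or higher (illustrative helper)."""
    return version_info[0] >= 3

# Stop early instead of failing later on a Python 3-only import
if not is_python3():
    raise SystemExit("This script requires Python 3.")

print("Running on Python", sys.version_info[0])
```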
Python 3 introduced syntax features that are not available in Python 2. One example is the yield from syntax (added in Python 3.3), which lets a generator delegate to another generator.
# Python 3
def gen1():
    yield 1
    yield 2
def gen2():
    yield from gen1()
    yield 3
for num in gen2():
    print(num)
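For plain iteration, yield from behaves like an explicit loop that re-yields each value, which is how the same generator would have to be written in Python 2. The sketch below puts both forms side by side; the names gen2_delegating and gen2_explicit are illustrative.

```python
# Python 3
def gen1():
    yield 1
    yield 2

def gen2_delegating():
    # Delegate to gen1, then continue with this generator's own values
    yield from gen1()
    yield 3

def gen2_explicit():
    # Equivalent for simple iteration, and also valid in Python 2
    for value in gen1():
        yield value
    yield 3

print(list(gen2_delegating()))  # Output: [1, 2, 3]
print(list(gen2_explicit()))    # Output: [1, 2, 3]
```

The two forms are not identical in every case: yield from also forwards send() and throw() calls to the inner generator and captures its return value, which the explicit loop does not.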
pandas is a popular library for data manipulation that was long available for both Python 2 and Python 3. However, because pandas has since dropped Python 2 support, Python 3 is the recommended choice for new projects.
# Python 3
import pandas as pd
data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data)
print(df)
Matplotlib is a widely used library for data visualization. Older releases support both Python 2 and Python 3, but Matplotlib 3.0 and later require Python 3, so Python 3 is needed for current releases and for compatibility with other modern libraries.
# Python 3
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 10, 100)
y = np.sin(x)
plt.plot(x, y)
plt.show()
scikit-learn is a powerful machine learning library. Python 3 is the preferred version because current releases, which carry the latest features and bug fixes, run only on Python 3.
# Python 3
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
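# Load the iris dataset and hold out 30% of the samples for testing
iris = load_iris()
# random_state fixes the shuffle so the split is reproducible
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3, random_state=42)
# Fit a k-nearest neighbors classifier with the default settings
knn = KNeighborsClassifier()
knn.fit(X_train, y_train)
# score() returns the mean accuracy on the held-out test set
accuracy = knn.score(X_test, y_test)
print("Accuracy:", accuracy)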
The differences between Python 2 and Python 3 matter in data science because they affect syntax, numeric results, text handling, and library availability. Python 2 was once the standard, but Python 3 is now the norm thanks to its improved features, better Unicode support, and wider library compatibility. Data scientists should use Python 3 for new projects and migrate existing Python 2 code to stay up-to-date with the latest developments in the field.