Authors

  • Qurbonov Behruz Amrulloyevich
  • Yondoshaliyev Alisher Elyorjon o‘g‘li

DOI:

https://doi.org/10.71337/inlibrary.uz.jnci.114217

Keywords:

Analysis of IoT Big Data, pandas, scikit-learn, TensorFlow, Anomaly Detection, Autoencoders, PySpark.

Abstract

The Internet of Things (IoT) has revolutionized data collection by enabling billions of interconnected devices to generate vast amounts of data, often referred to as big data. These devices, ranging from smart sensors in industrial systems to wearable health monitors, produce high-volume, high-velocity, and high-variety data that require advanced analytical techniques for meaningful insights. Artificial Intelligence (AI), with its capabilities in machine learning (ML), deep learning (DL), and predictive analytics, is uniquely suited to process and analyze IoT-generated big data. This article explores the fundamentals of AI-driven analysis of big data from IoT devices, addressing methods, challenges, solutions, and mathematical formulations to quantify performance and efficiency.



JOURNAL OF NEW CENTURY INNOVATIONS

https://scientific-jl.com/new

Volume–79_Issue-2_June-2025


ARTIFICIAL INTELLIGENCE ANALYSIS OF BIG DATA COLLECTED THROUGH IOT DEVICES

Qurbonov Behruz Amrulloyevich
Tashkent University of Information Technologies named after Muhammad al-Khwarizmi
3rd-year student, Faculty of Software Engineering
Recipient of the Muhammad al-Khwarizmi scholarship

Yondoshaliyev Alisher Elyorjon o‘g‘li
Tashkent University of Information Technologies named after Muhammad al-Khwarizmi
2nd-year student, Faculty of Software Engineering



Data Ingestion and Preprocessing

IoT devices generate streaming data in formats such as JSON, CSV, or time-series logs. Preprocessing is critical for handling noise, missing values, and heterogeneity.

• Streaming Data Ingestion: Tools such as Apache Kafka or the paho-mqtt client in Python enable real-time data collection from IoT devices. The throughput of data ingestion is:

$$\Theta = \frac{D}{T}$$

where Θ is throughput, D is data volume, and T is processing time.
• Missing Value Imputation: Missing data is common in IoT due to connectivity issues. Imputation uses the mean of neighboring values or time-series interpolation:



$$\hat{x}_t = \frac{1}{n} \sum_{i=1}^{n} x_i$$

where x̂_t is the imputed value at time t, and x_i are the n neighboring values.
• Normalization: IoT data often spans different scales. Min-max normalization is used:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

where x′ is the normalized value, and x_min and x_max are the minimum and maximum values.
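The three preprocessing steps above can be sketched in plain Python. This is a minimal illustration rather than a production pipeline: the function names, the sample payloads, and the field names `device` and `temp` are assumptions made for the example. Throughput is taken as data volume divided by processing time, and imputation uses the mean of the nearest available neighbors. In practice, pandas offers equivalents such as `Series.interpolate` for the imputation step.

```python
import json


def throughput(data_volume_bytes, processing_time_s):
    """Throughput = D / T, here in bytes per second."""
    if processing_time_s <= 0:
        raise ValueError("processing time must be positive")
    return data_volume_bytes / processing_time_s


def impute_mean_of_neighbors(values):
    """Replace each None with the mean of its nearest non-missing neighbors."""
    filled = list(values)
    for t, v in enumerate(values):
        if v is None:
            left = next((x for x in reversed(values[:t]) if x is not None), None)
            right = next((x for x in values[t + 1:] if x is not None), None)
            neighbors = [x for x in (left, right) if x is not None]
            if neighbors:  # leave the gap untouched if no neighbor exists
                filled[t] = sum(neighbors) / len(neighbors)
    return filled


def min_max_normalize(values):
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


# Hypothetical JSON payloads as they might arrive from one sensor.
messages = [
    '{"device": "s1", "temp": 20.0}',
    '{"device": "s1", "temp": null}',
    '{"device": "s1", "temp": 24.0}',
    '{"device": "s1", "temp": 30.0}',
]
volume = sum(len(m.encode("utf-8")) for m in messages)
readings = [json.loads(m)["temp"] for m in messages]

rate = throughput(volume, processing_time_s=2.0)  # assumed 2 s of processing
imputed = impute_mean_of_neighbors(readings)
normalized = min_max_normalize(imputed)
```

The missing reading is filled before normalization so that the minimum and maximum are computed over a complete series.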

Exploratory Data Analysis (EDA)

EDA identifies patterns in IoT data using visualization tools such as matplotlib and seaborn. Correlation analysis quantifies the linear relationship between two variables X and Y:

$$\rho = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$$

where ρ is the Pearson correlation coefficient, Cov(X, Y) is the covariance, and σ_X and σ_Y are the standard deviations.
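As a small worked example, the Pearson coefficient can be computed directly from its definition as covariance divided by the product of the standard deviations. The sensor series below (`temperature`, `fan_speed`, `power_draw`) are made-up values chosen only to show a perfectly positive and a perfectly negative relationship. In a pandas workflow, `DataFrame.corr` gives the same quantity for every pair of columns.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: Cov(X, Y) / (sigma_X * sigma_Y)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (std_x * std_y)


# Illustrative readings only; not data from the article.
temperature = [20.0, 22.0, 24.0, 26.0, 28.0]
fan_speed = [1000.0, 1100.0, 1200.0, 1300.0, 1400.0]
power_draw = [5.0, 4.5, 4.0, 3.5, 3.0]

rho_up = pearson(temperature, fan_speed)     # rises with temperature
rho_down = pearson(temperature, power_draw)  # falls with temperature
```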

Anomaly Detection

Anomaly detection identifies unusual patterns in IoT data, such as equipment failures or cyber threats. Unsupervised learning algorithms such as Isolation Forest and autoencoders are effective here because they do not require labeled examples of anomalies.

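• Isolation Forest: This algorithm isolates anomalies by randomly partitioning the data; anomalies need fewer partitions to isolate and therefore have shorter paths. The anomaly score is based on path length:

$$s(x, n) = 2^{-\frac{E(h(x))}{c(n)}}$$

where s(x, n) is the anomaly score, E(h(x)) is the average path length of x across the trees, and c(n) is the average path length for n samples, used for normalization.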

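• Autoencoders: These deep learning models are trained to reconstruct normal data, so a high reconstruction error indicates an anomaly:

$$L = \frac{1}{n} \sum_{i=1}^{n} \lVert x_i - \hat{x}_i \rVert^2$$

where L is the reconstruction loss, x_i is the input, x̂_i is the reconstructed output, and n is the number of samples.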
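Both scoring rules can be illustrated without training a model. The sketch below evaluates the Isolation Forest score from a given average path length, using the standard normalizer c(n) = 2H(n-1) - 2(n-1)/n with the harmonic number approximated by ln(n-1) plus Euler's constant. It then flags samples by per-sample reconstruction error. The reconstructions and the threshold of 0.01 are hypothetical values standing in for the output of a trained autoencoder; a real pipeline would obtain path lengths from a fitted forest (for example scikit-learn's `IsolationForest`) and reconstructions from a fitted network.

```python
import math

EULER_GAMMA = 0.5772156649


def c(n):
    """Average path length of an unsuccessful binary-search-tree lookup."""
    if n > 2:
        harmonic = math.log(n - 1) + EULER_GAMMA  # approximates H(n - 1)
        return 2.0 * harmonic - 2.0 * (n - 1) / n
    return 1.0 if n == 2 else 0.0


def isolation_score(avg_path_length, n):
    """s(x, n) = 2 ** (-E(h(x)) / c(n)); values near 1 suggest an anomaly."""
    if n < 2:
        raise ValueError("need at least 2 samples to normalize path length")
    return 2.0 ** (-avg_path_length / c(n))


def reconstruction_errors(inputs, reconstructions):
    """Per-sample mean squared error between each input and its reconstruction."""
    return [
        sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
        for x, x_hat in zip(inputs, reconstructions)
    ]


n = 256
short_path = isolation_score(2.0, n)     # isolated after about 2 splits
typical_path = isolation_score(c(n), n)  # path of average length

# Hypothetical inputs and autoencoder outputs; the third is poorly rebuilt.
inputs = [[0.50, 0.52], [0.49, 0.51], [0.95, 0.10]]
reconstructions = [[0.51, 0.52], [0.49, 0.50], [0.55, 0.45]]
errors = reconstruction_errors(inputs, reconstructions)
threshold = 0.01  # assumed cut-off; normally tuned on normal data
flags = [e > threshold for e in errors]
```

A point with an average-length path scores exactly 0.5, so scores well above 0.5 mark points that were isolated unusually quickly.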

Predictive Modeling

Predictive models forecast future trends or events, such as equipment maintenance needs. Time-series models such as ARIMA or Long Short-Term Memory (LSTM) networks are used.

ARIMA: Models time-series data as:

$$\phi(B)(1 - B)^d y_t = \theta(B)\epsilon_t$$

where ϕ(B) and θ(B) are the autoregressive and moving-average polynomials, B is the backshift operator, d is the differencing order, y_t is the time series, and ϵ_t is white noise.
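The differencing part of ARIMA, the (1 - B)^d term, is easy to show in isolation. The sketch below applies it to a short made-up series and also inverts one round of differencing, which is how forecasts on the differenced scale are mapped back to the original scale. It does not fit the autoregressive or moving-average coefficients; libraries such as statsmodels handle that. The function names are illustrative.

```python
def difference(series, d=1):
    """Apply (1 - B)^d: difference the series d times."""
    out = list(series)
    for _ in range(d):
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out


def undifference_once(last_value, differenced_forecasts):
    """Invert one round of differencing by cumulative summation."""
    restored, level = [], last_value
    for delta in differenced_forecasts:
        level += delta
        restored.append(level)
    return restored


# Illustrative readings with an accelerating trend.
readings = [10.0, 12.0, 15.0, 19.0, 24.0]
first_diff = difference(readings, d=1)   # removes the level
second_diff = difference(readings, d=2)  # removes the linear trend as well

# Map hypothetical forecasts of the first difference back to the raw scale.
restored = undifference_once(readings[-1], [6.0, 7.0])
```

Here the second difference is constant, which suggests d = 2 would make this particular series stationary.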

LSTM: Captures long-term dependencies in sequential data:



$$h_t = o_t \odot \tanh(C_t)$$

where h_t is the hidden state, o_t is the output gate, C_t is the cell state, and ⊙ denotes element-wise multiplication.
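In the standard LSTM formulation, the hidden state is the output gate multiplied element-wise by the tanh of the cell state. The sketch below shows only that output step; it assumes the output gate is a sigmoid of some pre-activation and takes the cell state as given, so the input and forget gates that update C_t are omitted. The numeric values are arbitrary.

```python
import math


def sigmoid(z):
    """Logistic function, squashing a pre-activation into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_hidden_state(output_gate_preact, cell_state):
    """h_t = o_t * tanh(C_t), element-wise, with o_t = sigmoid(pre-activation)."""
    o_t = [sigmoid(z) for z in output_gate_preact]
    return [o * math.tanh(c) for o, c in zip(o_t, cell_state)]


# Three units with the same cell state but different gate openings:
# half open, almost fully open, almost closed.
h_t = lstm_hidden_state([0.0, 10.0, -10.0], [1.0, 1.0, 1.0])
```

The gate decides how much of the stored cell state is exposed: the nearly closed unit emits a value close to zero even though its cell state is the same as the others.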

Data Heterogeneity

IoT data varies in format, scale, and quality, complicating analysis.
– Problem: Heterogeneous data reduces model accuracy. The spread of a feature is quantified by its variance:

$$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2$$

where σ^2 is the variance, x_i are the data points, µ is the mean, and n is the number of data points.
– Solution: Use data integration techniques, such as schema mapping in pandas, and feature engineering to standardize the data.
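A minimal sketch of schema mapping, written in plain Python so it runs without dependencies: two hypothetical vendors report the same quantity under different field names and units, and each record is mapped onto one common schema. All field names (`id`, `dev`, `temp_f`, `temperature`, `device_id`, `temperature_c`) are invented for the example. With pandas the renaming step would typically use `DataFrame.rename`.

```python
def to_common_schema(record):
    """Map a device-specific payload onto one schema (hypothetical fields)."""
    if "temp_f" in record:  # assumed vendor A: Fahrenheit, key "id"
        celsius = (record["temp_f"] - 32.0) * 5.0 / 9.0
        return {"device_id": record["id"], "temperature_c": celsius}
    if "temperature" in record:  # assumed vendor B: Celsius, key "dev"
        return {"device_id": record["dev"], "temperature_c": record["temperature"]}
    raise ValueError(f"unknown schema: {sorted(record)}")


def variance(values):
    """Population variance: mean squared deviation from the mean."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


raw = [
    {"id": "a1", "temp_f": 68.0},
    {"dev": "b7", "temperature": 21.0},
    {"id": "a2", "temp_f": 71.6},
]
unified = [to_common_schema(r) for r in raw]

# Variance of the raw numbers mixes units; after mapping it reflects
# the actual spread of temperatures.
spread_before = variance([68.0, 21.0, 71.6])
spread_after = variance([r["temperature_c"] for r in unified])
```

The large variance before mapping comes from the unit mismatch rather than from the sensors, which is the kind of artificial spread that standardization removes.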

IoT data often includes sensitive information, raising privacy and security concerns.

– Problem: Centralized data storage risks breaches, with privacy loss measured by differential privacy:

$$\frac{P(M \mid D)}{P(M \mid D')} \le e^{\epsilon}$$

where ϵ is the privacy budget, and P(M|D) and P(M|D′) are the model output probabilities for datasets D and D′ that differ in a single record.

– Solution: Implement federated learning, where models are trained locally and only the updates are aggregated:

$$\Delta W = \frac{1}{k} \sum_{i=1}^{k} \nabla L_i(W)$$

where ∆W is the aggregated model update, ∇L_i(W) is the gradient from device i, and k is the number of devices. Use encryption for data transmission.
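The aggregation step and the privacy bound can both be checked numerically. The first part of the sketch averages per-device gradients into a single update and applies one gradient-descent step; the gradients, weights, and learning rate are made-up numbers. The second part uses the Laplace mechanism, which the article does not mention and which is introduced here only as a common way to satisfy the bound: with noise scale equal to sensitivity divided by ϵ, the ratio of output densities for two neighboring datasets should stay at or below e^ϵ.

```python
import math


def federated_average(gradients):
    """Delta W = (1/k) * sum of per-device gradients, element-wise."""
    k = len(gradients)
    if k == 0:
        raise ValueError("need at least one device update")
    return [sum(component) / k for component in zip(*gradients)]


def apply_update(weights, delta, learning_rate):
    """One gradient-descent step on the shared model."""
    return [w - learning_rate * d for w, d in zip(weights, delta)]


def laplace_pdf(x, center, scale):
    """Density of the Laplace distribution."""
    return math.exp(-abs(x - center) / scale) / (2.0 * scale)


# Hypothetical gradients reported by k = 3 devices for a 2-parameter model.
device_gradients = [[0.2, -0.4], [0.4, 0.0], [0.0, -0.2]]
delta_w = federated_average(device_gradients)
weights = apply_update([1.0, 1.0], delta_w, learning_rate=0.5)

# Assumed setting: a query whose true answer is 10 on D and 11 on D',
# so the sensitivity is 1; noise scale = sensitivity / epsilon.
epsilon, sensitivity = 0.5, 1.0
scale = sensitivity / epsilon
ratios = [
    laplace_pdf(x, 10.0, scale) / laplace_pdf(x, 11.0, scale)
    for x in (5.0, 10.0, 10.5, 11.0, 20.0)
]
bound = math.exp(epsilon)
```

Only the averaged update leaves the devices in this scheme; the raw readings used to compute each gradient stay local.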

Key Algorithms for AI Analysis of IoT Data



Conclusion

AI-driven analysis of big data from IoT devices enables transformative applications in smart cities, healthcare, and manufacturing. Methods such as anomaly detection, predictive modeling, and distributed computing, supported by Python libraries, handle the complexity of IoT data. Challenges such as data volume, heterogeneity, privacy, and computational complexity are mitigated through sampling, federated learning, and GPU acceleration. Mathematical formulations and algorithms, including Isolation Forest, Autoencoders, and SGD, provide a rigorous foundation for these solutions. By combining AI and IoT, organizations can extract actionable insights that drive efficiency and innovation.




