Exploring Iris Data: Classification Modeling and Visual Analyses

Abstract

In this paper, we examine the Iris dataset using two classification models: Random Forest and Logistic Regression. The main goal of our research is to distinguish between Iris species based on their petal and sepal measurements. Visual analyses are also included to deepen our understanding of the data and the relationships between features.

Introduction

The Iris plant is widely recognized and holds an important place in research. The dataset comprises three species: Iris setosa, Iris virginica, and Iris versicolor. It offers a good opportunity for developing and assessing machine learning classification models because of its measured features and the clear differences among the species.

Methods
Data

The study analyzed the Iris dataset, which contains 150 samples of Iris species, each characterized by four attributes: petal length, petal width, sepal length, and sepal width.
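As a minimal sketch of the dataset described above, the copy bundled with scikit-learn can be loaded and inspected. Using `load_iris` is an assumption made for illustration; the study's own data file and loading code may differ.

```python
from sklearn.datasets import load_iris

# Assumption: the scikit-learn copy of the Iris dataset stands in for the study's data.
iris = load_iris()
X, y = iris.data, iris.target

print(X.shape)             # number of samples and number of attributes
print(iris.feature_names)  # the four measured attributes
print(iris.target_names)   # the three species
```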

Data Preprocessing

To prepare the data for analysis, any missing values were replaced using the values of their respective columns. The dataset was then split into training (80%) and testing (20%) subsets.
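The preprocessing steps above could look roughly like the following sketch. Mean imputation, the stratified split, and `random_state=42` are assumptions, since the text does not say which column statistic or seed was used. The single missing value is injected artificially so the imputer has something to fill.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Hypothetical: blank out one measurement to demonstrate imputation.
X_missing = X.copy()
X_missing[0, 0] = np.nan

# Assumption: missing entries are replaced by the column mean.
imputer = SimpleImputer(strategy="mean")
X_imputed = imputer.fit_transform(X_missing)

# 80% training / 20% testing, stratified so each species keeps its share.
X_train, X_test, y_train, y_test = train_test_split(
    X_imputed, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```

In a stricter workflow the imputer would be fitted on the training subset only, so that no information from the test subset leaks into preprocessing. The order here follows the description in the text.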

Modeling

For modeling, two classification algorithms, Random Forest and Logistic Regression, were applied. The performance of both models was assessed using accuracy, recall, and F1 score.
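A minimal sketch of this modeling step, assuming scikit-learn's implementations of both algorithms, is shown below. The hyperparameters (`n_estimators=100`, `max_iter=1000`), the seed, and macro averaging for recall and F1 are illustrative assumptions, not settings confirmed by the study.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, recall_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Assumed hyperparameters; the study does not report its settings.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Macro averaging weights the three species equally.
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred, average="macro"),
        "f1": f1_score(y_test, y_pred, average="macro"),
    }
    print(name, results[name])
```

With only 30 test samples, the scores depend noticeably on which samples land in the test subset, so a different seed can shift each metric by a few points.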

Results

The findings revealed that both models excelled at identifying Iris species, each achieving 100% accuracy on the test set. Additionally, visual representations such as scatter plots, bar graphs, and heatmaps highlighted the distinctions between species based on their characteristics.
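The quantities behind such bar graphs and heatmaps can be sketched numerically. The example below assumes the bar graphs show per-species feature means and the heatmap shows pairwise feature correlations; the study's actual figures may have been built differently.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

# Per-species mean of each feature (what a grouped bar graph would display).
species_means = np.array([X[y == k].mean(axis=0) for k in range(3)])

# Pearson correlation between the four features (what a heatmap would color).
feature_corr = np.corrcoef(X, rowvar=False)

for name, means in zip(iris.target_names, species_means):
    print(name, np.round(means, 2))
print(np.round(feature_corr, 2))
```

Passing `feature_corr` to a plotting routine such as matplotlib's `imshow` would render it as a heatmap.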

Discussion

The visual analyses demonstrated that the measured features can effectively distinguish between the Iris species. The model results indicated high predictive power and potential applicability in other biological fields.

Conclusion

This study underscores that data mining can provide significant insights into biological diversity and the characteristics of plant species. Classification models have proven to be effective tools for predicting and analyzing Iris species, highlighting the importance of employing data mining techniques in scientific research.


https://github.com/marzieh135/projectiris