Analysis of Email Spam Detection Using Integration of Logistic Regression and Particle Swarm Optimization
A content-based text classification system can automatically categorize text documents into a limited set of predefined classes. However, classifying email documents is a challenging task in the modern internet environment. First, email documents are sparsely represented in a high-dimensional feature space, which makes learning and generalization difficult. Second, labeling email documents is costly, so researchers are compelled to gather training data from different sources and from different types of target domain, which results in a shift between the training data and the test data. Third, although unlabelled data is readily available, using it in practical email classification to enhance performance remains a challenge. Spam is unsolicited mail. The bulk of it is sent by spammers who use mass-mailing programs to conceal their identities and who send spam every day at virtually no cost. Spam has various harmful effects, including exposing recipients to unwanted images, reducing company productivity, and congesting Internet Service Providers' (ISP) networks. In addition, spam may carry a virus designed for some fraudulent activity. A robust and efficient spam filtering and classification process is therefore needed. Classification can readily be encoded as a multivariable optimization problem. When a class prototype is represented by a centroid in a multidimensional space, classification can be seen as the problem of finding the optimal positions of all the class centroids, i.e., determining the optimal coordinates of each centroid. PSO is very effective at solving multivariable problems in which the variables take real values. It is used here as a stand-alone technique to classify datasets and to study the PSO-based classifier for multiclass data sets.
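The centroid formulation described above can be illustrated with a minimal sketch. It assumes a standard global-best PSO in which each particle encodes the coordinates of all class centroids, and the fitness is the summed distance from every training sample to its own class centroid. The function names, the inertia and acceleration constants (0.7, 1.5, 1.5), and the toy data are illustrative assumptions, not values taken from this work.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def fitness(particle, X, y):
    """Sum of distances from each sample to the centroid of its own class."""
    d = len(X[0])
    return sum(dist(xi, particle[yi * d:(yi + 1) * d]) for xi, yi in zip(X, y))

def predict(particle, x, n_classes):
    """Assign x to the class whose centroid is nearest."""
    d = len(x)
    dists = [dist(x, particle[k * d:(k + 1) * d]) for k in range(n_classes)]
    return dists.index(min(dists))

def pso_centroids(X, y, n_classes, n_particles=30, iters=100, seed=0):
    """Search for class centroids with a basic global-best PSO (assumed variant)."""
    rng = random.Random(seed)
    d = len(X[0])
    lo = [min(row[j] for row in X) for j in range(d)]
    hi = [max(row[j] for row in X) for j in range(d)]
    dim = n_classes * d  # one coordinate block per class centroid
    pos = [[rng.uniform(lo[k % d], hi[k % d]) for k in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pfit = [fitness(p, X, y) for p in pos]
    g = min(range(n_particles), key=pfit.__getitem__)
    gbest, gfit = pbest[g][:], pfit[g]
    for _ in range(iters):
        for i in range(n_particles):
            for k in range(dim):
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][k] = (0.7 * vel[i][k]
                             + 1.5 * rng.random() * (pbest[i][k] - pos[i][k])
                             + 1.5 * rng.random() * (gbest[k] - pos[i][k]))
                pos[i][k] += vel[i][k]
            f = fitness(pos[i], X, y)
            if f < pfit[i]:
                pbest[i], pfit[i] = pos[i][:], f
                if f < gfit:
                    gbest, gfit = pos[i][:], f
    return gbest, gfit

# Toy two-class data (illustrative only, not from the study).
TOY_X = [[0, 0], [0, 1], [1, 0], [1, 1], [9, 9], [9, 10], [10, 9], [10, 10]]
TOY_Y = [0, 0, 0, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    best, fit = pso_centroids(TOY_X, TOY_Y, n_classes=2)
    print("centroids:", best, "fitness:", fit)
```

Encoding all centroids in one particle keeps the search space real-valued, which is the setting in which PSO is said above to work well. A real email classifier would replace the two toy coordinates with a feature vector extracted from each message.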
Spam has become the platform of choice for cyber-criminals to spread malicious payloads such as viruses and Trojans. Collaborative spam detection techniques can deal with large-scale e-mail data contributed by multiple sources, but they have the well-known problem of requiring disclosure of e-mail content. One common solution for preserving the privacy of e-mail content is distance-preserving hashing, which enables message classification for spam detection. PSO is applied within a Big Data privacy-preserving collaborative spam detection platform built on top of a standard MapReduce facility. Without PSO feature selection the training accuracy is lower; when PSO selects the features that best fit the dataset, the classification result is higher.
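The combination of PSO feature selection with logistic regression can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: it assumes a binary PSO in which each particle is a 0/1 feature mask, the fitness is the training accuracy of a logistic regression model fitted by plain gradient descent on the selected features, and one particle starts with all features enabled. All function names, constants, and the toy data are hypothetical.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_lr(X, y, epochs=200, lr=0.5):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def accuracy(X, y, mask):
    """Training accuracy of logistic regression using only features with mask[j] == 1."""
    cols = [j for j, m in enumerate(mask) if m]
    if not cols:
        return 0.0  # an empty feature set cannot classify anything
    Xs = [[row[j] for j in cols] for row in X]
    w, b = train_lr(Xs, y)
    hits = 0
    for xi, yi in zip(Xs, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
        hits += int((p >= 0.5) == (yi == 1))
    return hits / len(y)

def pso_select(X, y, n_particles=8, iters=10, seed=0):
    """Binary PSO over feature masks; returns (best_mask, best_training_accuracy)."""
    rng = random.Random(seed)
    d = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_particles)]
    pos[0] = [1] * d  # assumption: seed one particle with the full feature set
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pfit = [accuracy(X, y, p) for p in pos]
    g = max(range(n_particles), key=pfit.__getitem__)
    gbest, gfit = pbest[g][:], pfit[g]
    for _ in range(iters):
        for i in range(n_particles):
            for j in range(d):
                vel[i][j] = (0.7 * vel[i][j]
                             + 1.5 * rng.random() * (pbest[i][j] - pos[i][j])
                             + 1.5 * rng.random() * (gbest[j] - pos[i][j]))
                # The velocity sets the probability that the feature is switched on.
                pos[i][j] = 1 if rng.random() < sigmoid(vel[i][j]) else 0
            f = accuracy(X, y, pos[i])
            if f > pfit[i]:
                pbest[i], pfit[i] = pos[i][:], f
                if f > gfit:
                    gbest, gfit = pos[i][:], f
    return gbest, gfit

# Toy data (illustrative only): feature 0 separates the classes, the rest are noise.
TOY_X = [[0.1, 0.9, 0.3], [0.2, 0.1, 0.8], [0.3, 0.5, 0.5], [0.2, 0.7, 0.1],
         [0.9, 0.8, 0.4], [0.8, 0.2, 0.7], [0.7, 0.4, 0.6], [0.9, 0.6, 0.2]]
TOY_Y = [0, 0, 0, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    mask, acc = pso_select(TOY_X, TOY_Y)
    print("selected mask:", mask, "training accuracy:", acc)
```

Because one particle starts with every feature enabled, the mask returned by this sketch never scores lower on training accuracy than the full feature set. A practical system would measure fitness on held-out data instead, since training accuracy alone tends to reward overfitting.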
Keywords - Email Spam Detection.