Data Mining - Science topic

Questions related to Data Mining
  • asked a question related to Data Mining
Question
2 answers
2024 4th International Conference on Computer, Remote Sensing and Aerospace (CRSA 2024) will be held at Osaka, Japan on July 5-7, 2024.
Conference Website: https://ais.cn/u/MJVjiu
---Call For Papers---
The topics of interest for submission include, but are not limited to:
1. Algorithms
Image Processing
Data processing
Data Mining
Computer Vision
Computer Aided Design
......
2. Remote Sensing
Optical Remote Sensing
Microwave Remote Sensing
Remote Sensing Information Engineering
Geographic Information System
Global Navigation Satellite System
......
3. Aeroacoustics
Aeroelasticity and structural dynamics
Aerothermodynamics
Airworthiness
Autonomy
Mechanisms
......
All accepted papers will be published in the Conference Proceedings, and submitted to EI Compendex, Scopus for indexing.
Important Dates:
Full Paper Submission Date: May 31, 2024
Registration Deadline: May 31, 2024
Conference Date: July 5-7, 2024
For More Details please visit:
Invitation code: AISCONF
*Using the invitation code on submission system/registration can get priority review and feedback
Relevant answer
Answer
Dear Kazi Redwan, the Regular Registration (4-6 pages) fee is 485 USD. Online presentation is accepted. All accepted papers will be published in the Conference Proceedings, and submitted to EI Compendex, Scopus for indexing.
For more details about registration, please visit http://www.iccrsa.org/registration_all
For Paper submission: https://ais.cn/u/MJVjiu
  • asked a question related to Data Mining
Question
3 answers
I am currently doing research on data mining in digital marketing and would like to get your opinion on the following:
1. The effects of data mining and its impact on digital marketing.
2. Does data mining artificially alter organizations' marketing campaigns? If yes, what are the pros and cons? If no, please state your reasons or observations.
3. Is data mining the future of digital marketing? Will it determine the profitability of organizations in the near future?
4. Any other advice on this topic to aid my research.
Relevant answer
Answer
Let's talk about data mining in digital marketing. Basically, it's like digging for gold in a mountain of data. Here's the deal: data mining helps digital marketers understand their audience better than ever before. It's like having a superpower to predict what your customers want before they even know it themselves.
First off, data mining gives marketers insights into customer behavior. It's like peeking into their minds to see what they like, what they buy, and how they interact online. This info is pure gold because it helps marketers tailor their messages and products to fit their audience perfectly.
Then there's predictive analysis. This is where data mining gets really cool. By crunching numbers and patterns, marketers can predict future trends and behaviors. It's like having a crystal ball that tells you what your customers will do next. This helps in planning marketing strategies ahead of time and staying ahead of the competition.
Another big impact is personalization. With data mining, marketers can create hyper-personalized experiences for their customers. From targeted ads to personalized recommendations, it's all about making the customer feel special and understood. And when customers feel like you 'get' them, they're more likely to stick around and become loyal fans.
But, of course, with great power comes great responsibility. Data mining raises some serious privacy concerns. Marketers need to be careful about how they collect and use customer data, making sure to respect their privacy and earn their trust.
So yeah, data mining is a game-changer in digital marketing. It's like having a secret weapon that helps marketers understand their audience better, predict the future, and create personalized experiences that keep customers coming back for more. Pretty cool, right?
  • asked a question related to Data Mining
Question
3 answers
What are the most prominent commercial data mining software applications currently available to fraud examiners to assist with investigations?
Relevant answer
Answer
If you are doing financial fraud analysis, I would recommend one of the three main tools traditionally used by CFEs.
I personally used ActiveData and found it very easy to use.
If you are doing computer forensics, EnCase, FTK, or open-source equivalents should do (as long as you can testify to the accuracy of the results, you can use the tool of your preference).
  • asked a question related to Data Mining
Question
4 answers
2024 4th International Conference on Machine Learning and Intelligent Systems Engineering (MLISE 2024) will be held on June 28-30, 2024, in Zhuhai, China.
MLISE is conducting an exciting series of symposium programs that connect researchers, scholars and students to industry leaders and highly relevant information. The conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world. MLISE proposes new ideas, strategies and structures, innovating the public sector, promoting technical innovation and fostering creativity in the development of services.
---Call For Papers---
The topics of interest for submission include, but are not limited to:
1. Machine Learning
- Deep and Reinforcement learning
- Pattern recognition and classification for networks
- Machine learning for network slicing optimization
- Machine learning for 5G system
- Machine learning for user behavior prediction
......
2. Intelligent Systems Engineering
- Intelligent control theory
- Intelligent control system
- Intelligent information systems
- Intelligent data mining
- AI and evolutionary algorithms
......
All papers, both invited and contributed, will be reviewed by two or three experts from the committees. After a careful reviewing process, all accepted papers of MLISE 2024 will be published in the MLISE 2024 Conference Proceedings by IEEE (ISBN: 979-8-3503-7507-7), which will be submitted to IEEE Xplore, EI Compendex, Scopus for indexing.
Important Dates:
Submission Deadline: April 26, 2024
Registration Deadline: May 26, 2024
Conference Dates: June 28-30, 2024
For More Details please visit:
Invitation code: AISCONF
*Using the invitation code on submission system/registration can get priority review and feedback
Relevant answer
Answer
Yes, the conference is in hybrid format; both online and offline participation are accepted.
Submitting your papers to the system is free. Once your paper is accepted, you will need to pay the registration fee. The registration fee can be found on the website: http://mlise.org/registration
  • asked a question related to Data Mining
Question
4 answers
Are there any free Scopus-indexed journals in the data mining field?
Relevant answer
Answer
Dear Nouran Radwan, have a look here for some potentially interesting suggestions:
Best regards.
PS. Do realise that you can also consider a subscription-based journal (or a hybrid journal where you decline the open access option); these are (most of the time) free of charge.
  • asked a question related to Data Mining
Question
1 answer
I am currently doing research on data mining in digital marketing and would like to get your opinion on the following:
1. The effects of data mining and its impact on digital marketing.
2. Does data mining artificially alter organizations' marketing campaigns? If yes, what are the pros and cons? If no, please state your reasons or observations.
3. Is data mining the future of digital marketing? Will it determine the profitability of organizations in the near future?
4. Any other advice on this topic to aid my research.
Relevant answer
Answer
Dear Nneka Olasetemi, please do recommend my answer if helpful.
Data mining has a significant impact on digital marketing, enabling marketers to leverage large volumes of data to make informed decisions, improve targeting, personalize content, and enhance overall marketing effectiveness. Here are some key impacts of data mining in digital marketing:
1. **Customer Insights and Segmentation**: Data mining allows marketers to analyze customer data to gain insights into behavior, preferences, and purchasing patterns. By segmenting customers based on these insights, marketers can tailor marketing campaigns to specific audience segments, improving relevance and engagement.
2. **Personalization**: With data mining, marketers can create personalized marketing messages, offers, and recommendations tailored to individual customers' preferences and past interactions. Personalization enhances the customer experience, fosters loyalty, and increases the likelihood of conversion.
3. **Predictive Analytics**: Data mining techniques such as predictive analytics enable marketers to forecast future trends, identify potential opportunities, and anticipate customer needs. By analyzing historical data, marketers can make data-driven predictions about customer behavior, market trends, and campaign performance, allowing for proactive decision-making and strategic planning.
4. **Optimized Targeting and Acquisition**: Data mining helps marketers identify high-value prospects and target them with relevant offers and content. By analyzing demographic, behavioral, and transactional data, marketers can identify potential customers who are most likely to convert and optimize their marketing efforts to acquire them cost-effectively.
5. **Customer Retention and Loyalty**: By analyzing customer data, marketers can identify at-risk customers and implement targeted retention strategies to reduce churn and foster loyalty. Data mining helps marketers understand the factors influencing customer loyalty and satisfaction, enabling them to tailor retention efforts and improve customer lifetime value.
6. **Campaign Optimization**: Data mining allows marketers to analyze the performance of marketing campaigns in real-time and optimize them for better results. By tracking key metrics such as click-through rates, conversion rates, and return on investment (ROI), marketers can identify areas for improvement and adjust their strategies accordingly to maximize effectiveness.
7. **Competitive Intelligence**: Data mining enables marketers to gather insights into competitors' strategies, market positioning, and customer behavior. By analyzing publicly available data and monitoring competitors' activities, marketers can identify emerging trends, benchmark performance, and stay ahead of the competition.
Overall, data mining empowers digital marketers with actionable insights, enabling them to make informed decisions, improve targeting and personalization, optimize campaigns, and drive better results in today's competitive digital landscape.
  • asked a question related to Data Mining
Question
1 answer
2024 IEEE 7th International Conference on Computer Information Science and Application Technology (CISAT 2024) will be held on July 12-14, 2024 in Hangzhou, China.
---Call For Papers---
The topics of interest for submission include, but are not limited to:
◕ Computational Science and Algorithms
· Algorithms
· Automated Software Engineering
· Bioinformatics and Scientific Computing
......
◕ Intelligent Computing and Artificial Intelligence
· Basic Theory and Application of Artificial Intelligence
· Big Data Analysis and Processing
· Biometric Identification
......
◕ Software Process and Data Mining
· Software Engineering Practice
· Web Engineering
· Multimedia and Visual Software Engineering
......
◕ Intelligent Transportation
· Intelligent Transportation Systems
· Vehicular Networks
· Edge Computing
· Spatiotemporal Data
All accepted papers, both invited and contributed, will be published and submitted for inclusion in IEEE Xplore, subject to meeting IEEE Xplore's scope and quality requirements, and will also be submitted to EI Compendex and Scopus for indexing. Conference proceedings papers must be at least 4 pages long.
Important Dates:
Full Paper Submission Date: April 14, 2024
Submission Date: May 12, 2024
Registration Deadline: June 14, 2024
Conference Dates: July 12-14, 2024
For More Details please visit:
Invitation code: AISCONF
*Using the invitation code on submission system/registration can get priority review and feedback
Relevant answer
Please let me know if anyone is interested to o
  • asked a question related to Data Mining
Question
1 answer
2024 5th International Conference on Artificial Intelligence and Electromechanical Automation (AIEA 2024) will be held in Shenzhen, China, from June 14 to 16, 2024.
---Call For Papers---
The topics of interest for submission include, but are not limited to:
(1) Artificial Intelligence
- Intelligent Control
- Machine learning
- Modeling and identification
......
(2) Sensor
- Sensor/Actuator Systems
- Wireless Sensors and Sensor Networks
- Intelligent Sensor and Soft Sensor
......
(3) Control Theory And Application
- Control System Modeling
- Intelligent Optimization Algorithm and Application
- Man-Machine Interactions
......
(4) Material science and Technology in Manufacturing
- Artificial Material
- Forming and Joining
- Novel Material Fabrication
......
(5) Mechanic Manufacturing System and Automation
- Manufacturing Process Simulation
- CIMS and Manufacturing System
- Mechanical and Liquid Flow Dynamic
......
All accepted papers will be published in the Conference Proceedings, which will be submitted for indexing by EI Compendex, Scopus.
Important Dates:
Full Paper Submission Date: April 1, 2024
Registration Deadline: May 31, 2024
Final Paper Submission Date: May 14, 2024
Conference Dates: June 14-16, 2024
For More Details please visit:
Invitation code: AISCONF
*Using the invitation code on submission system/registration can get priority review and feedback
Relevant answer
Answer
Data science
  • asked a question related to Data Mining
Question
7 answers
2024 3rd International Conference on Biomedical and Intelligent Systems (IC-BIS 2024) will be held from April 26 to 28, 2024, in Nanchang, China.
It is a comprehensive conference which focuses on Biomedical Engineering and Artificial Intelligent Systems. The main objective of IC-BIS 2024 is to address and deliberate on the latest technical status and recent trends in the research and applications of Biomedical Engineering and Bioinformatics. IC-BIS 2024 provides an opportunity for the scientists, engineers, industrialists, scholars and other professionals from all over the world to interact and exchange their new ideas and research outcomes in related fields and develop possible chances for future collaboration. The conference also aims at motivating the next generation of researchers to promote their interests in Biomedical Engineering and Artificial Intelligent Systems.
Important Dates:
Registration Deadline: March 26, 2024
Final Paper Submission Date: April 22, 2024
Conference Dates: April 26-28, 2024
---Call For Papers---
The topics of interest for submission include, but are not limited to:
- Biomedical Signal Processing and Medical Information
· Biomedical signal processing
· Medical big data and machine learning
· Application of artificial intelligent for biomedical signal processing
......
- Bioinformatics & Intelligent Computing
· Algorithms and Software Tools
· Algorithms, models, software, and tools in Bioinformatics
· Biostatistics and Stochastic Models
......
- Gene regulation, expression, identification and network
·High-performance computational systems biology and parallel implementations
· Image Analysis
· Inference from high-throughput experimental data
......
For More Details please visit:
Relevant answer
Answer
Very nice and interesting.
  • asked a question related to Data Mining
Question
3 answers
I am trying to train a CNN model in Matlab to predict the mean value of a random vector (the Matlab code named Test_2 is attached). To clarify, I generate a random vector with 10 components (using the rand function) 500 times. The plot of each vector versus 1:10 is saved as a separate image, and the mean value of each of the 500 randomly generated vectors is calculated and saved. The saved images are then used as the input (X) for training (70%), validating (15%) and testing (15%) a CNN model that is supposed to predict the mean value of each random vector (Y). However, the RMSE of the model remains too high. In other words, the model does not train despite changes to its options and parameters. I would be grateful if anyone could kindly advise.
Relevant answer
Answer
Dear Renjith Vijayakumar Selvarani and Dear Qamar Ul Islam,
Many thanks for your notice.
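A quick sanity check for the setup described in the question, sketched in Python rather than Matlab. The attached Test_2 script is not reproduced in this thread, so everything below is an assumption about it. The mean of 10 uniform random numbers varies very little: its theoretical standard deviation is sqrt(1/120), roughly 0.091. A model that always predicts the training-set mean therefore already reaches an RMSE near that value, which gives a baseline against which the CNN's RMSE can be judged.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

N_VECTORS, N_COMPONENTS = 500, 10  # sizes taken from the question

# Generate the random vectors and their mean values (the regression targets Y).
vectors = [[random.random() for _ in range(N_COMPONENTS)] for _ in range(N_VECTORS)]
targets = [statistics.fmean(v) for v in vectors]

# 70 / 15 / 15 split as in the question (the validation part is unused here).
n_train = N_VECTORS * 70 // 100
n_val = N_VECTORS * 15 // 100
train_y = targets[:n_train]
test_y = targets[n_train + n_val:]


def rmse(predictions, truth):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(statistics.fmean([(p - t) ** 2 for p, t in zip(predictions, truth)]))


# Baseline: always predict the mean of the training targets.
baseline = statistics.fmean(train_y)
baseline_rmse = rmse([baseline] * len(test_y), test_y)

# Theoretical spread of the mean of 10 independent U(0, 1) values.
theoretical_std = math.sqrt(1 / 12 / N_COMPONENTS)

print(f"baseline RMSE:   {baseline_rmse:.4f}")
print(f"theoretical std: {theoretical_std:.4f}")
```

If the CNN's test RMSE is close to the baseline, the network is effectively predicting a constant; an RMSE well above it may point to a problem in the data pipeline, such as unnormalized images or per-image axis scaling that hides the actual values.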
  • asked a question related to Data Mining
Question
3 answers
..
Relevant answer
Answer
Dear Doctor
"Data mining is the probing of available datasets in order to identify patterns and anomalies. Machine learning is the process of machines (a.k.a. computers) learning from heterogeneous data in a way that mimics the human learning process."
  • asked a question related to Data Mining
Question
2 answers
Can anyone advise whether this topic is suitable for PhD work? The topic is "Study on the Data Mining Techniques in Healthcare Sector with emphasis on Breast Cancer".
Relevant answer
Answer
Dear friend Pushpraj Singh
A PhD topic in data mining techniques in the healthcare sector, with a focus on breast cancer, offers a unique opportunity for breakthroughs. The intersection of data mining and healthcare presents a wealth of potential insights, and your research could have a significant impact on people's lives. While the challenges are significant, the potential rewards are great, and tackling real-world problems is the beauty of it. If you're up for the challenge, this topic is definitely worth considering.
My publication might be interesting to read:
  • asked a question related to Data Mining
Question
1 answer
What are the possibilities of applying generative AI in terms of conducting sentiment analysis of changes in Internet users' opinions on specific topics?
What are the possibilities of applying generative artificial intelligence in carrying out sentiment analysis on changes in the opinions of Internet users on specific topics using Big Data Analytics and other technologies typical of Industry 4.0/5.0?
Nowadays, Internet marketing is developing rapidly, including viral Internet marketing used on social media sites, among others, in the form of, for example, Real-Time marketing in the formula of viral marketing. It is also marketing aimed at precisely defined groups, audience segments, potential customers of a specific advertised product and/or service offering. In terms of improving Internet marketing, new ICT information technologies and Industry 4.0/5.0 are being implemented. Marketing conducted in this form is usually preceded by market research conducted using, among other things, sentiment analysis of the preferences of potential consumers based on verification of their activity on the Internet, taking into account comments written on various websites, Internet forums, blogs, posts written on social media. In recent years, the importance of the aforementioned sentiment analysis carried out on large data sets using Big Data Analytics has been growing, thanks to which it is possible to study the psychological aspects of the phenomena of changes in the trends of certain processes in the markets for products, services, factor markets and financial markets. The development of the aforementioned analytics makes it possible to study the determinants of specific phenomena occurring in the markets caused by changes in consumer or investor preferences, caused by specific changes in the behavior of consumers in product and service markets, entrepreneurs in factor markets or investors in money and capital markets, including securities markets. The results from these analyses are used to forecast changes in the behavior of consumers, entrepreneurs and investors that will occur in the following months and quarters. 
In addition to this, sentiment analyses are also conducted to determine the preferences, awareness of potential customers, consumers in terms of recognition of the company's brand, its offerings, description of certain products and services, etc., using textual data derived from comments, entries, posts, etc. posted by Internet users, including social media users on a wide variety of websites. The knowledge gained in this way can be useful for companies to plan marketing strategies, to change the product and service offerings produced, to select or change specific distribution channels, after-sales services, etc. This is now a rapidly developing field of research and the possibilities for many companies and enterprises to use the results of this research in marketing activities, but not only in marketing. Recently, opportunities are emerging to apply generative artificial intelligence and other Industry 4.0/5.0 technologies to analyze large data sets collected on Big Data Analytics platforms. In connection with the development of intelligent chatbots available on the Internet, recently there have been discussions about the possibilities of potential applications of generative artificial intelligence, 5G and other technologies included in the Industry 4.0/5.0 group in the context of using the information resources of the Internet to collect data on citizens, companies, institutions, etc. for their analysis carried out using, among other things, sentiment analysis to determine the opinion of Internet users on certain topics or to define the brand recognition of a company, the evaluation of product or service offerings by Internet users. In recent years, the scope of applications of Big Data technology and Data Science analytics, Data Analytics in economics, finance and management of organizations, including enterprises, financial and public institutions is increasing. 
Accordingly, the implementation of analytical instruments of advanced processing of large data sets in enterprises, financial and public institutions, i.e. the construction of Big Data Analytics platforms to support organizational management processes in various aspects of operations, including the improvement of customer relations, is also growing in importance. In recent years, ICT information technologies, Industry 4.0/5.0 including generative artificial intelligence technologies are particularly rapidly developing and finding application in knowledge-based economies. These technologies are used in scientific research and business applications in commercially operating enterprises and in financial and public institutions. In recent years, the application of generative artificial intelligence technologies for collecting and multi-criteria analysis of Internet data can significantly contribute to the improvement of sentiment analysis of Internet users' opinions and the possibility of expanding the applications of research techniques carried out on analytical platforms of Business Intelligence, Big Data Analytics, Data Science and other research techniques using ICT information technology, the Internet and advanced data processing typical of Industry 4.0/5.0. Most consumers of online information services available on new online media, including social media portals, are not fully aware of the level of risk of sharing information about themselves on these portals and the use of this data by technological online companies using this data for their analytics. I am conducting research on this issue. I have included the conclusions of my research in scientific publications, which are available on ResearchGate. I invite you to cooperate with me.
In view of the above, I address the following question to the esteemed community of scientists and researchers:
What are the possibilities for the application of generative AI in terms of conducting sentiment analysis of changes in the opinions of Internet users on specific topics using Big Data Analytics and other technologies typical of Industry 4.0/5.0?
What are the possibilities of using generative AI in conducting sentiment analysis of Internet users' opinions on specific topics?
And what is your opinion on this topic?
What is your opinion on this issue?
Please answer,
I invite everyone to join the discussion,
Thank you very much,
Best wishes,
Dariusz Prokopowicz
The above text is entirely my own work written by me on the basis of my research.
In writing this text I did not use other sources or automatic text generation systems.
Dariusz Prokopowicz
Relevant answer
Answer
In today's digital age, the internet has become a breeding ground for opinions and sentiments on various topics. With the advent of Industry 4.0/5.0 technologies, such as big data analytics and generative AI, there are endless possibilities for conducting sentiment analysis on changes in the opinions of internet users.
Generative AI, powered by machine learning algorithms, can analyze vast amounts of data to identify patterns and trends in user sentiments. By leveraging big data analytics, this technology can sift through massive datasets to extract valuable insights regarding specific topics. This allows businesses and organizations to understand public opinion better and make informed decisions based on these sentiments.
One significant advantage of using generative AI for sentiment analysis is its ability to adapt and evolve with changing opinions. As public sentiment fluctuates over time, traditional methods may struggle to keep up with these changes. However, generative AI can continuously learn from new data inputs and adjust its analysis accordingly.
Furthermore, the application of generative AI in sentiment analysis can provide real-time insights into public opinion. This is particularly useful during times of crisis or when monitoring social trends that impact businesses or governments. By analyzing social media posts, online reviews, and other forms of user-generated content in real-time, generative AI can help identify emerging sentiments before they become mainstream.
However, it is important to note that while generative AI offers immense potential for sentiment analysis on specific topics using big data analytics within Industry 4.0/5.0 technologies, ethical considerations must be taken into account as well. Privacy concerns surrounding the collection and use of personal data must be addressed transparently to ensure trust between users and technology providers.
  • asked a question related to Data Mining
Question
2 answers
Hello there, I am searching for datasets of software requirements and their use cases, in the hope of gathering use-case data for those requirements to train an ML model for research we are working on. Does anyone know a source for such datasets?
Relevant answer
Answer
Najib Abusalbi, yes, I did. I searched dataset websites such as Hugging Face, Kaggle and Google datasets, the Google search engine and Google Scholar, and journals and many other websites. I did not manage to find any public repository except the one made by the National council of Italy. Even in published papers and articles, authors rarely mention where they got their datasets or where they are available; sadly, only a few do.
  • asked a question related to Data Mining
Question
3 answers
I want to analyze a research problem in educational data mining using machine learning algorithms. I want to build a model that suggests to school students which domain to select for higher education, by evaluating each student's school dataset together with the higher-education dataset of the same student.
Relevant answer
Answer
Impact of social media on students in post covid period.
Impact of mobile usage on students in higher education post covid period
  • asked a question related to Data Mining
Question
12 answers
What is spatial and temporal mining in data mining and what are spatial data structures in data mining?
Relevant answer
Answer
Dr Mohammad Imam thank you for your contribution to the discussion
  • asked a question related to Data Mining
Question
20 answers
I am interested in digging deeper into the connection between data analysis methods and the information visualizations that can be generated from them. For example, data clustering (in data mining) produces a certain kind of information. Which visualization method would best visualize that information, and why?
I have found this http://www.visual-literacy.org/periodic_table/periodic_table.html which is very good at depicting the different visualization methods but does not explain which data analysis method each of them is connected to.
Any recommended good source?
Thanks
Relevant answer
Answer
Just a quick answer on data visualization. I highly recommend learning Python and using matplotlib to visualize data. Many libraries focusing on data mining, AI, and machine learning already exist.
Many online courses, books, and Python manuals are available.
Learning Python to work with data is really worth it. For starters, the best approach is to find a YouTube video on the topic you want to solve, or on a closely related one.
References:
[1] Joakim Sundnes: "Introduction to Scientific Programming with Python", Springer, Simula SpringerBriefs on Computing (2020) ISBN 978-3-030-50355-0 (Open Access https://doi.org/10.1007/978-3-030-50356-7)
[2] H.P. Langtangen: A Primer on Scientific Programming with Python, Texts in Computational Science and Engineering 6, Springer, DOI 10.1007/978-3-662-49887-3
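To make the clustering example from the question concrete: cluster assignments are categorical labels attached to points, so a scatter plot colored by cluster is a common first choice. The following is a minimal sketch, assuming two-dimensional toy data and a plain k-means written out by hand (neither comes from the thread). The helper `plot_clusters` is a hypothetical name; it uses matplotlib, as recommended above, and imports it only when called.

```python
import random

random.seed(1)  # fixed seed so the toy data is repeatable

# Toy data (an assumption for illustration): two well-separated blobs.
points = [(random.gauss(0, 0.5), random.gauss(0, 0.5)) for _ in range(50)]
points += [(random.gauss(4, 0.5), random.gauss(4, 0.5)) for _ in range(50)]


def kmeans(data, k, iterations=20):
    """Plain k-means on 2-D points; returns (centroids, label per point)."""
    centroids = random.sample(data, k)
    labels = [0] * len(data)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2)
            for x, y in data
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(data, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, labels


def plot_clusters(data, labels, centroids, path="clusters.png"):
    """Scatter plot colored by cluster label; requires matplotlib."""
    import matplotlib.pyplot as plt

    xs, ys = zip(*data)
    cx, cy = zip(*centroids)
    fig, ax = plt.subplots()
    ax.scatter(xs, ys, c=labels, cmap="viridis", s=15)
    ax.scatter(cx, cy, c="red", marker="x", s=80, label="centroids")
    ax.set_title("k-means clusters (toy data)")
    ax.legend()
    fig.savefig(path)


centroids, labels = kmeans(points, k=2)
print("centroids:", centroids)
```

Calling `plot_clusters(points, labels, centroids)` writes the figure to `clusters.png`. For data with more than two dimensions, a projection (for example PCA) would be needed before the same plot applies.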
  • asked a question related to Data Mining
Question
3 answers
Please suggest a new research topic in computer science on data mining using machine learning.
Relevant answer
Answer
Dear Pushpraj Singh,
I give my proposal for a research topic, research thesis, thesis concept in the area of your interest:
Research Context: In recent years, the scale of various crises (economic, financial, social, health, food, energy, nature, climate, etc.) has been increasing. As a result, the importance of improving crisis management techniques and using new ICT information technologies and Industry 4.0 for this purpose is growing. The importance of improving risk management processes using new Industry 4.0 technologies, including but not limited to Big Data Analytics and Artificial Intelligence, is also growing.
Accordingly, the research topic may address the following issue: The application of selected ICT information technologies, Industry 4.0, the technologies of the current fourth technological revolution, including Big Data Analytics, machine learning, deep learning, artificial intelligence to improve risk management systems, early warning systems within the framework of crisis management, and the improvement of forecasting models used to predict abnormal situations, events of special risk increases, emergencies, specific types of disasters, etc.
I would like to invite you to join me for scientific cooperation on this issue,
Kind regards,
Dariusz Prokopowicz
  • asked a question related to Data Mining
Question
2 answers
Compared with the older, still widely used emulsion-type explosives, charging the tunnel face with bulk explosive gives better and higher-quality vibration values if you are drilling the tunnel face with a jumbo equipped with MWD (measurement while drilling). This is because an MWD-capable machine drills heterogeneously in a formation whose face surface is uneven, so the drilling lengths differ. A homogeneous charge in a heterogeneous face is therefore difficult to achieve with an emulsion-type explosive of constant weight (in kilograms). For this reason, I think that more stable vibration data will be obtained with bulk charging. What is your opinion?
Relevant answer
Answer
I obtained an empirical formula with 95% accuracy rate with emulsion type explosive. Thank you very much for your esteemed reply. I think I can get more accurate results with bulk charging. Thank you very much for your interest, Mr. Signh.
  • asked a question related to Data Mining
Question
10 answers
I created my own large dataset from different sites and labeled it for an NLP task. How and where can I publish it as a paper or article?
Relevant answer
Answer
Publishing your own created labeled corpus can be done through various avenues depending on your goals and the field you're working in. If you wish to contribute to the academic community and share your research findings, publishing it in the form of an article or paper in relevant journals or conference proceedings would be appropriate. This allows you to provide a detailed description of your corpus creation process, its applications, and potential insights derived from it. Alternatively, you could explore open-access platforms or repositories specific to linguistic resources, such as the Linguistic Data Consortium (LDC), where researchers can deposit and share their corpora. Additionally, if your corpus is of significant value and relevance, you may consider reaching out to organizations or institutions involved in language processing or research, as they may be interested in hosting and making it accessible to others in the field.
  • asked a question related to Data Mining
Question
5 answers
Hello everyone,
I want to find emerging patterns of blockchain applications in cybersecurity. I've collected and filtered my dataset, which now consists of 1183 research items indexed in WoS and Scopus. Which text mining algorithms can fulfill this purpose?
I found burst detection and LDA suitable, but as a tourism student I would like to know about other possibilities and hear the suggestions of professionals.
Best wishes.
Relevant answer
Answer
One text mining algorithm that can fulfill the purpose of identifying emerging patterns of blockchain applications in cybersecurity from your dataset of 1183 research items indexed in WoS and Scopus is topic modeling using Latent Dirichlet Allocation (LDA). LDA is a probabilistic model that can discover hidden topics within a collection of documents by assigning probability distributions to words and topics. By applying LDA to your dataset, you can uncover the underlying themes and topics related to blockchain applications in cybersecurity. This algorithm can help identify patterns, common trends, and relationships among the research items, enabling you to gain insights into the emerging patterns in this domain.
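Before fitting a topic model, a much simpler first pass can already hint at emerging themes: count, per publication year, the share of documents that mention each term, and flag terms whose share grows. The sketch below is neither LDA nor burst detection, just a frequency-growth heuristic. The records, the `(year, text)` layout, and the `min_growth` threshold are all made up for illustration.

```python
from collections import Counter, defaultdict
import re

def term_share_by_year(records):
    """Return {year: {term: share of that year's documents containing the term}}."""
    doc_counts = Counter()
    term_docs = defaultdict(Counter)
    for year, text in records:
        doc_counts[year] += 1
        # Count each term once per document (document frequency).
        for term in set(re.findall(r"[a-z]+", text.lower())):
            term_docs[year][term] += 1
    return {
        year: {t: c / doc_counts[year] for t, c in terms.items()}
        for year, terms in term_docs.items()
    }

def emerging_terms(records, early, late, min_growth=0.75):
    """Terms whose document share rose by at least min_growth from early to late."""
    shares = term_share_by_year(records)
    rising = {}
    for term, late_share in shares.get(late, {}).items():
        growth = late_share - shares.get(early, {}).get(term, 0.0)
        if growth >= min_growth:
            rising[term] = growth
    return dict(sorted(rising.items(), key=lambda kv: -kv[1]))

# Hypothetical abstracts, not real WoS/Scopus items.
records = [
    (2018, "blockchain for secure logging"),
    (2018, "blockchain access control"),
    (2022, "blockchain federated learning security"),
    (2022, "federated learning with blockchain consensus"),
]
print(emerging_terms(records, early=2018, late=2022))
```

On a real corpus you would probably want stop-word removal and multi-word phrases before trusting the ranking; terms flagged this way can then be checked against the topics a model such as LDA produces.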
  • asked a question related to Data Mining
Question
4 answers
Hello everyone, I'm currently working on my master's thesis, in which I want to find current and future application patterns of a technology in an industry, based on previous research on the topic, by analyzing the title, abstract, conclusion, and implications of these articles (if that is even possible). I'm not sure which data mining method and algorithm I should use to get the best possible results. It would be great if you could give me advice and feedback.
Best regards.
Relevant answer
Answer
Choosing the right data mining method and algorithm depends on your use case. There are many different data mining methods and algorithms available, each with its own strengths and weaknesses. Some of the most popular data mining methods include clustering, classification, regression, and association rule mining. To determine which method is best for your use case, you should consider factors such as the size of your dataset, the type of data you are working with, and the specific problem you are trying to solve.
  • asked a question related to Data Mining
Question
7 answers
..
Relevant answer
Answer
Data Processing and Data Mining are both essential components of the data analysis process, but they have distinct purposes and methods. Here's a breakdown of the key differences between the two:
Data Processing: Data processing refers to the manipulation and transformation of raw data into a more meaningful and organized format. It involves various operations that cleanse, validate, integrate, and format data to make it suitable for further analysis. The primary goal of data processing is to ensure data quality, consistency, and reliability. It typically includes tasks such as data cleaning, data transformation, data aggregation, and data summarization. Data processing focuses on preparing data for efficient storage, retrieval, and analysis.
Data Mining: Data mining, on the other hand, is a specific technique or process within data analysis that involves discovering patterns, relationships, and insights from a large volume of data. It employs statistical and mathematical algorithms, machine learning techniques, and data visualization tools to extract knowledge and actionable information from the data. Data mining aims to uncover hidden patterns, trends, correlations, or anomalies that are not readily apparent. It can be used to solve specific business problems, predict future outcomes, identify market trends, or support decision-making processes.
In summary, data processing is the broader concept that encompasses the overall handling and preparation of data, ensuring its quality and consistency. Data mining, on the other hand, is a focused analysis technique that aims to extract valuable insights and knowledge from processed data by applying various statistical and machine-learning algorithms.
  • asked a question related to Data Mining
Question
1 answer
Resea
Relevant answer
Answer
Dear Nimota Jabaar Biobaku,
attached is a short bibliography where you can find some information about the relationship between Data Mining and SDN.
Best regards and much success
Anatol Badach
Kyriakos Sideris, Reza Nejabati, Dimitra Simeonidou: "Seer: Empowering Software Defined Networking with Data Analytics"; 15th International Conference on Ubiquitous Computing and Communications and 2016 International Symposium on Cyberspace and Security (IUCC-CSS), Dec 2016
Albert Mestres et al.: "Knowledge-Defined Networking"; ACM SIGCOMM Computer Communication Review, Vol. 47, Issue 3, Jul 2017
Haojun Huang et al.: "Data-Driven Information Plane in Software-Defined Networking"; IEEE Communications Magazine, Vol. 55, Issue 6, Jun 2017
Tam Nguyen: "The Challenges in SDN/ML Based Network Security: A Survey"; arXiv:1804.03539v2 [cs.CR], Apr 2018
Juliana Arevalo Herrera, Jorge E. Camargo: "A Survey on Machine Learning Applications for Software Defined Network Security"; International Conference on Applied Cryptography and Network Security (ACNS), Aug 2019
Yuhong Li, Xiang Su, Aaron Yi Ding et al.: "Enhancing the Internet of Things with Knowledge-Driven Software-Defined Networking Technology: Future Perspectives"; Sensors (MDPI), Vol. 20, Jun 2020
  • asked a question related to Data Mining
Question
4 answers
My team and I are trying to open a dialogue about designing a Continuum of Realism for synthetic data. We want to develop a meaningful way to talk about data in terms of the degree of realism that is necessary for a particular task. We feel the way to do this is by defining a continuum that shows that as data becomes more realistic, the analytic value increases, but so does the cost and risk of disclosure. Everyone seems to be interested in generating the most realistic data, but let's be honest, sometimes that's not the level of realism that we actually need. It is expensive and carries a high reidentification risk when working with PII. Sometimes we just need data to test our code, and we can't justify using this level of realism when the risk is so high. Have you also encountered this issue? Are you interested in helping us fulfill our mission? Ultimately we are trying to save money and protect consumer privacy. We would love to hear your thoughts!
Relevant answer
Answer
Yes, there is a continuum of realism for synthetic data. At one end of the continuum, we have completely synthetic data that is generated based on mathematical models or simulations. This type of data can be useful for testing hypotheses, exploring different scenarios, and evaluating methods without the constraints and biases of real-world data. However, it may not reflect the complexity and diversity of real-world data, and may not be useful for certain applications, such as training machine learning models.
At the other end of the continuum, we have real-world data that is collected directly from sources such as surveys, medical records, or social media platforms. This type of data can provide a rich and diverse representation of the phenomena of interest but may be limited by factors such as sample size, data quality, and ethical considerations.
Between these two extremes, we have various levels of realism that can be achieved through the use of synthetic data. For example, data may be generated based on real-world data using methods such as data augmentation or data synthesis, which can create new data points that are similar to the real data but with some degree of randomness or variability. Alternatively, data may be generated based on simulations or generative models that incorporate known properties of the real-world data, such as distributional properties or relationships between variables.
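To make the middle of that continuum concrete, here is a minimal sketch of three realism levels for a single numeric column. The function names, the "real" ages, and the three-level split are my own assumptions for illustration, not an established taxonomy.

```python
import random
import statistics

def synth_low_realism(n, low, high, seed=0):
    """Level 1: uniform noise within a plausible range; ignores the real data entirely."""
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(n)]

def synth_matched_moments(real, n, seed=0):
    """Level 2: Gaussian draws matching the real sample's mean and standard deviation."""
    rng = random.Random(seed)
    mu, sigma = statistics.mean(real), statistics.stdev(real)
    return [rng.gauss(mu, sigma) for _ in range(n)]

def synth_jittered(real, n, noise_sd, seed=0):
    """Level 3: resample real values and add small noise; most realistic, highest disclosure risk."""
    rng = random.Random(seed)
    return [rng.choice(real) + rng.gauss(0, noise_sd) for _ in range(n)]

real_ages = [23, 35, 41, 29, 52, 47, 38, 31]  # made-up "real" column
low = synth_low_realism(1000, 0, 100)
mid = synth_matched_moments(real_ages, 1000)
high = synth_jittered(real_ages, 1000, noise_sd=1.0)
for name, sample in [("uniform", low), ("moments", mid), ("jittered", high)]:
    print(name, round(statistics.mean(sample), 1), round(statistics.stdev(sample), 1))
```

For testing code, level 1 is often enough. Level 3 stays close to individual real values, which is exactly where the re-identification concern raised in the question comes in.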
As for your second question, as an AI language model, I am always ready to provide help and guidance on topics related to synthetic data and statistics. Please let me know if there is anything specific that I can assist you with.
  • asked a question related to Data Mining
Question
4 answers
It will be used for data mining research whose objective is to classify the best time of day for operating the wind farm.
Relevant answer
Answer
Laura Peçanha There are a number of wind datasets that include the factors you mentioned. The National Renewable Energy Laboratory (NREL), which maintains a comprehensive database of wind resource data for the United States, is one such source. The NREL wind resource database contains observations of wind speed, direction, and temperature at various heights above ground, as well as air density and turbulence strength. The data is delivered hourly and covers a variety of time periods based on the region.
The European Centre for Medium-Range Weather Forecasts (ECMWF) is another viable source of wind data, as it offers worldwide atmospheric reanalysis data that includes wind speed, direction, and temperature. The ECMWF data is accessible at several temporal resolutions, including hourly, and may be downloaded.
Other institutions and commercial enterprises that provide wind data services include AWS Truepower and Vaisala. These firms offer high-quality wind data and analytic tools that may be customized to meet unique research requirements.
In conclusion, various wind datasets are available that cover the variables you need for your research. Exploring numerous sources and evaluating data quality and relevance to your unique study objectives may be beneficial.
  • asked a question related to Data Mining
Question
3 answers
I would need a (tabular, i.e. not imaging or text) dataset with a hierarchically structured outcome to use as an example dataset in a new R package (but the dataset can be of any format, e.g. txt, csv or arff). It should be single-label and tree-structured, e.g. first level: classes 1, ..., 4, second level: 1.1, 1.2, 1.3, 2.1,2.2, third level: 1.1.1, 1.1.2, 1.2.1, 1.2.2, 1.2.3, 1.3.1, 1.3.2., ... .
Relevant answer
Answer
The labeling scheme you want to use is also popular when it comes to indexing semistructured documents (such as XML-documents), e.g. there is a labeling schema called ORDPATH:
With this schema, you could take any real-world collection of XML-documents and turn it into a dataset consisting of labels.
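Whatever the source of the labels, dotted labels like "1.2.3" are easy to turn back into an explicit tree. A small sketch follows; the labels are invented, and the empty string is used here as a hypothetical root node.

```python
from collections import defaultdict

def ancestors(label):
    """'1.2.3' -> ['1', '1.2', '1.2.3'] (one entry per hierarchy level)."""
    parts = label.split(".")
    return [".".join(parts[: i + 1]) for i in range(len(parts))]

def build_tree(labels):
    """Map each node to the sorted list of its direct children; '' is the root."""
    children = defaultdict(set)
    for label in labels:
        path = [""] + ancestors(label)
        for parent, child in zip(path, path[1:]):
            children[parent].add(child)
    return {node: sorted(kids) for node, kids in children.items()}

leaf_labels = ["1.1.1", "1.1.2", "1.2.1", "2.1", "2.2"]  # made-up single-label outcomes
tree = build_tree(leaf_labels)
print(tree[""])   # top-level classes
print(tree["1"])  # second level under class 1
```

Note that the children are sorted as strings, so a component like "10" would sort before "2"; that is fine for a sketch but worth handling in a real package.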
  • asked a question related to Data Mining
Question
3 answers
I want to carry out research on high school dropout and would like some help on this topic.
Relevant answer
Answer
Yes, there are several approaches to addressing high school dropout rates beyond data mining. Here are a few:
Implement targeted interventions: work with schools and communities to identify students at risk of dropping out and provide targeted interventions such as mentoring, tutoring, and after-school programs to keep them engaged and help them succeed.
Address underlying social determinants of academic success: Identify and address non-academic factors that contribute to high dropout rates, such as poverty, lack of access to healthcare, housing instability, and discrimination.
Provide alternative pathways to success: Support alternative routes to obtaining a high school diploma, such as vocational training, apprenticeships, and alternative learning programs like online or blended learning.
Foster a positive school culture: prioritize creating a positive school culture that values academic success, supports student engagement and wellbeing, and provides a safe and inclusive learning environment.
These approaches can all work together to tackle the complex and multifaceted issue of high school dropout rates. It is important to consider each in the context of the specific challenges and opportunities of the community being served.
  • asked a question related to Data Mining
Question
1 answer
yes. for further details contact now
Relevant answer
Answer
Yes. I suggest doing a search of ResearchGate using the terms "r package topic model" and following up on the top articles on topic modeling in R. There are other packages you can find by browsing the CRAN archives of R packages, but these articles are a good place to start.
I also recommend the book Text Analysis with R for Students of Literature by Matthew Jockers and Rosamond Thalken.
  • asked a question related to Data Mining
Question
2 answers
I am writing a PhD thesis on data mining. How can I write a good "Thesis Innovations" section? What are the key points?
Relevant answer
Answer
Ajit Singh Thanks for your valuable comment
  • asked a question related to Data Mining
Question
4 answers
The data, obtained from the institution's database, is to be used to analyze the GPA and CGPA of 1000 students. The available attributes are demographic; there are no behavioral attributes, income, etc. What type of data mining technique can be used to analyze these attributes and obtain patterns from the analysis?
Please give references on how the techniques can be applied.
Thank you! Appreciate it.
Relevant answer
Answer
One educational data mining technique that can be used to analyze students' performance attributes via patterns is called "cluster analysis".
Cluster analysis is a statistical technique that involves grouping similar observations or data points together based on their attributes or characteristics. In the context of education, cluster analysis can be used to identify patterns in students' performance attributes, such as grades, test scores, attendance records, or behavior.
For example, a school may collect data on students' performance attributes over a period of time and use cluster analysis to group students who exhibit similar patterns of behavior or academic performance. This can help identify groups of students who may require additional support or resources to succeed, as well as inform instructional strategies and curriculum development.
Another educational data mining technique that can be used to analyze students' performance attributes via patterns is "association rule mining". Association rule mining involves identifying patterns and relationships between variables in large datasets. In the context of education, association rule mining can be used to identify correlations between students' performance attributes and other factors such as demographic information, socioeconomic status, or extracurricular activities. This can help schools and educators better understand the factors that influence student performance and make informed decisions about how to support students in their learning.
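For association rule mining, the two basic quantities are support and confidence. Here is a minimal sketch of both on invented student records; the attribute names and values are hypothetical, and a real study would use a full algorithm such as Apriori to enumerate candidate rules.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) = support(A and C together) / support(A)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Each student is a set of "attribute=value" items (made-up data).
students = [
    {"gender=F", "region=urban", "cgpa=high"},
    {"gender=F", "region=urban", "cgpa=high"},
    {"gender=M", "region=urban", "cgpa=low"},
    {"gender=M", "region=rural", "cgpa=low"},
    {"gender=F", "region=rural", "cgpa=low"},
]
print(support(students, {"region=urban", "cgpa=high"}))
print(confidence(students, {"region=urban"}, {"cgpa=high"}))
```

Continuous attributes such as GPA have to be binned first (here into "high" and "low"), and the choice of bins can change which rules appear.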
  • asked a question related to Data Mining
Question
4 answers
Hello dear researchers,
I've just been accepted into a doctoral program, and my data consists of several thousand observations. I am planning to start with data mining to explore, classify, find associations, and detect anomalies. I have worked with Stata and am wondering whether Stata can do these things. Do you have any suggestions for references that connect Stata and data mining?
Relevant answer
Answer
Dear university staff!
I inform you that my lecture on electronic medicine on the topic: "The use of automated system-cognitive analysis for the classification of human organ tumors" can be downloaded from the site: https://www.patreon.com/user?u =87599532
Lecture with sound in English. You can download it and listen to it at your convenience.
Sincerely,
Vladimir Ryabtsev, Doctor of Technical Science, Professor Information Technologies.
  • asked a question related to Data Mining
Question
7 answers
Hi,
Most researchers know the R Views website, which is:
I am wondering whether this website contains all the R packages available to researchers.
Thanks & Best wishes
Osman
Relevant answer
Answer
no need to buy R
  • asked a question related to Data Mining
Question
4 answers
I am completely new to WEKA. I am trying to load a file that I got from Kaggle into WEKA, but I am met with an error. How can I convert the file from .csv to ARFF format?
This is where I got the file; I have already removed the extra columns.
Thank you very much.
Relevant answer
Answer
Some data types in your file may be mismatched. Check the date and name columns; the Name field has some bad characters.
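If WEKA's own loader keeps failing, one workaround is to write the ARFF file yourself. The sketch below is a minimal converter, not WEKA's converter: it assumes the first row is the header, treats a column as NUMERIC only if every non-empty value parses as a number, and writes everything else (including dates) as a quoted STRING. The sample data is made up.

```python
import csv
import io

def is_number(value):
    try:
        float(value)
        return True
    except ValueError:
        return False

def csv_to_arff(csv_text, relation="dataset"):
    """Convert CSV text (first row = header) to ARFF text; empty cells become '?'."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    for i, row in enumerate(data, start=2):
        if len(row) != len(header):
            raise ValueError(f"row {i}: expected {len(header)} fields, got {len(row)}")

    # A column is numeric if all of its non-empty cells parse as floats.
    numeric = [
        all(is_number(r[c]) for r in data if r[c].strip() != "")
        for c in range(len(header))
    ]
    lines = [f"@RELATION {relation}", ""]
    for name, num in zip(header, numeric):
        safe = name.strip().replace(" ", "_")
        lines.append(f"@ATTRIBUTE {safe} {'NUMERIC' if num else 'STRING'}")
    lines += ["", "@DATA"]
    for row in data:
        cells = []
        for value, num in zip(row, numeric):
            value = value.strip()
            if value == "":
                cells.append("?")
            elif num:
                cells.append(value)
            else:
                # Quote strings and escape backslashes and single quotes.
                cells.append("'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'")
        lines.append(",".join(cells))
    return "\n".join(lines) + "\n"

sample = "name,age,joined\nO'Brien,34,2020-01-05\nLee,,2021-07-19\n"
print(csv_to_arff(sample, relation="customers"))
```

The field-count check tends to surface the kind of mismatch mentioned above. String columns may still need to be converted to nominal inside WEKA before some classifiers accept them.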
  • asked a question related to Data Mining
Question
3 answers
My topic is "fraud detection in the banking sector using data mining techniques", so I am looking for a banking data set and guidance on how to use it.
Relevant answer
Answer
A machine learning dataset is a collection of data that is used to train the model. A dataset acts as an example to teach the machine learning algorithm how to make predictions. ... The common types of data include:
  1. Text data.
  2. Image data.
  3. Audio data.
  4. Video data.
  5. Numeric data.
  • asked a question related to Data Mining
Question
48 answers
The current technological revolution, known as Industry 4.0, is determined by the development of the following technologies of advanced information processing:
Big Data database technologies, cloud computing, machine learning, Internet of Things, artificial intelligence, Business Intelligence and other advanced data mining technologies.
Which of these technologies are applicable or will be used in the future in the education process?
Please reply
Best wishes
Relevant answer
Answer
Let’s be clear: the metaverse (however you define it) is decades away.
Which is not to say that it can be ignored in the meantime. Because while it may seem like science fiction or over-inflated hype at the moment, the fact remains that huge amounts of money and effort are being poured into making it happen – and educators need to at least be aware of its possible implications...
  • asked a question related to Data Mining
Question
6 answers
Hi everyone,
I am facing this problem in my MA thesis:
I have two time series datasets. The first dataset has numerical features and the second one has binary variables. I found in the literature these three methods that are able to determine the correlation between the two datasets:
- logistic regression
- point-biserial correlation
- Kruskal-Wallis H test
Unfortunately, I could not find out whether these methods are still applicable when the data are time series. I would appreciate some advice or explanations to figure this out :)
Another question: if I can use one of these methods, are there any limitations when my continuous variable is nonlinear?
Thanks in advance for your help :)
PS
both my datasets are stationary
# Data mining #correlation #time series analysis
Relevant answer
Answer
I ran the Dickey-Fuller test to check the stationarity of the features in my dataset. The test statistic was less than the critical value at 1%, and the p-value was far below 0.5%, which leads to rejecting the null hypothesis. As I understand from the literature, rejecting the null hypothesis means that the process has no unit root and, in turn, that the time series is stationary. Do you know a way to test the linearity of the features other than fitting a linear regression and checking R²?
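For reference, the point-biserial coefficient mentioned in the question is simple to compute by hand. The sketch below uses the population standard deviation, which makes it equal to Pearson's r between the 0/1 variable and the numeric one. The two series are invented, and `lagged` is a hypothetical helper for looking at delayed effects. The coefficient measures linear association only, and with autocorrelated series the usual significance tests should be treated with caution.

```python
import math
import statistics

def point_biserial(binary, values):
    """Point-biserial correlation between a 0/1 variable and a numeric variable."""
    if len(binary) != len(values):
        raise ValueError("inputs must have the same length")
    ones = [v for b, v in zip(binary, values) if b == 1]
    zeros = [v for b, v in zip(binary, values) if b == 0]
    n1, n0, n = len(ones), len(zeros), len(values)
    if n1 == 0 or n0 == 0:
        raise ValueError("both groups need at least one observation")
    s = statistics.pstdev(values)  # population SD of all values
    return (statistics.mean(ones) - statistics.mean(zeros)) / s * math.sqrt(n1 * n0 / n**2)

def lagged(binary, values, lag):
    """Pair binary[t] with values[t + lag] to look at a delayed relationship."""
    return binary[: len(binary) - lag], values[lag:]

flag = [0, 0, 1, 1, 0, 1, 0, 1]                  # made-up binary series
load = [1.0, 1.2, 3.1, 2.9, 1.1, 3.3, 0.9, 3.0]  # made-up numeric series
print(round(point_biserial(flag, load), 3))
print(round(point_biserial(*lagged(flag, load, 1)), 3))
```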
  • asked a question related to Data Mining
Question
5 answers
What do you consider are the implications of Big Data on urban planning practice?
Relevant answer
Glory be to Allah... As time progresses, new developments appear that help people to complete their needs with flexibility and ease.
  • asked a question related to Data Mining
Question
4 answers
Good evening dear researchers,
I have a data set from the KEGG database in CSV format. I was trying to convert it into ARFF format using WEKA for further analysis.
WEKA keeps giving me an error saying that the file is not recognized as a CSV file. I searched for the cause and found that the file needs to be cleansed and put into a suitable data structure before it is valid and ready to be analyzed.
Unfortunately, I do not currently have the ability or the knowledge to do that, and I need it as soon as possible.
Can anyone help with the problem? thank you so much for your time.
kind regards.
attachments:
-data set csv file.
-error png clip.
Relevant answer
Answer
Data cleaning is the process of fixing or removing incorrect, corrupted, incorrectly formatted, duplicate, or incomplete data within a dataset. When combining multiple data sources, there are many opportunities for data to be duplicated or mislabeled. If data is incorrect, outcomes and algorithms are unreliable, even though they may look correct. There is no one absolute way to prescribe the exact steps in the data cleaning process because the processes will vary from dataset to dataset. But it is crucial to establish a template for your data cleaning process so you know you are doing it the right way every time.
Regards,
Shafagat
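As one possible starting template for the cleaning steps described above, here is a minimal sketch that trims whitespace, skips blank rows, drops exact duplicates, and reports rows whose field count does not match the header. The column names and values are invented (they only imitate KEGG-style identifiers), and the real file may need further, dataset-specific checks.

```python
import csv
import io

def clean_csv(text):
    """Return (cleaned_rows, problems) for CSV text whose first row is the header."""
    rows = list(csv.reader(io.StringIO(text)))
    header = [h.strip() for h in rows[0]]
    cleaned, problems, seen = [header], [], set()
    for row_no, row in enumerate(rows[1:], start=2):
        row = [cell.strip() for cell in row]
        if not any(row):
            continue  # skip blank rows
        if len(row) != len(header):
            problems.append((row_no, f"expected {len(header)} fields, got {len(row)}"))
            continue
        key = tuple(row)
        if key in seen:
            continue  # drop exact duplicates
        seen.add(key)
        cleaned.append(row)
    return cleaned, problems

raw = (
    "gene, pathway ,score\n"
    "K00001,map00010,0.9\n"
    "K00001,map00010,0.9\n"
    "K00002,map00020\n"
    "\n"
    "K00003,map00030,0.4\n"
)
cleaned, problems = clean_csv(raw)
print(cleaned)
print(problems)
```

The `problems` list tells you which rows to inspect by hand before loading the cleaned rows into WEKA.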
  • asked a question related to Data Mining
Question
4 answers
We are looking at the application of data mining in water quality space. There are several articles to begin with and refer, and it is a bit confusing. Trying to narrow down the scope.
Relevant answer
Answer
The objectives in evaluating river profiles in urban centers
  • asked a question related to Data Mining
Question
1 answer
Hi everyone. I'm trying to apply spatial data mining to a set of vector and raster files, so I need a way to convert my raster files into CSV in order to run the mining.
A little background: my thesis is about applying data mining in archaeology with the intention of modeling archaeological sites. Currently I'm struggling to convert the rasters to CSV to run the data mining.
Relevant answer
Answer
The Export Raster pane allows you to export the entire raster dataset, mosaic dataset, image service or the portion in the display.
  1. In the Contents pane, right-click the raster layer you want to export, click Data, and click Export Raster. ...
  2. Choose the appropriate output as required in the Output Raster Dataset field.
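Once the raster values are available as a grid, flattening them into a table for mining is straightforward. The sketch below assumes the raster has already been read into a list of rows (how you load it depends on your GIS tool and is not shown); the origin, cell size, and NoData value are hypothetical.

```python
import csv
import io

def raster_to_csv(grid, x_origin, y_origin, cell_size, nodata=None):
    """Flatten a 2-D grid (row 0 = northernmost row) into x,y,value CSV text.

    Coordinates are cell centres; cells equal to `nodata` are skipped.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["x", "y", "value"])
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value == nodata:
                continue
            x = x_origin + (c + 0.5) * cell_size
            y = y_origin - (r + 0.5) * cell_size
            writer.writerow([x, y, value])
    return out.getvalue()

# Made-up 2x3 elevation grid with one NoData cell.
elevation = [
    [120, 125, -9999],
    [118, 122, 130],
]
text = raster_to_csv(elevation, x_origin=500000.0, y_origin=4200000.0,
                     cell_size=30.0, nodata=-9999)
print(text)
```

Each output row is one cell, so several rasters on the same grid can be joined on the x and y columns to build one table with a column per layer.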
  • asked a question related to Data Mining
Question
1 answer
Could you please recommend to me a package or tool for the drop3 instance selection method?
Relevant answer
Answer
Hopefully this link will help:
  • asked a question related to Data Mining
Question
9 answers
Well,
I am a very curious person. During Covid-19 in 2020, working with coded data and using only the last name, I noticed that in my country people with certain surnames were more likely to die than others (and this pattern has remained unchanged over time). Using ratios and proportions, I found inconsistencies by performing a "conversion" so that all surnames had the same weighting. After that, a simple exercise in probability and statistics revealed this controversial fact.
Of course, what I did was a shallow study, just a data mining exercise, but it has been something that caught my attention, even more so when talking to an Indian researcher who found similar patterns within his country about another disease.
In the context of pandemics (for the end of these and others that may come)
I think it would be interesting to have a line of research involving different professionals such as data scientists; statisticians/mathematicians; sociology and demographics; human sciences; biological sciences to compose a more refined study on this premise.
Some questions still remain:
What if we could have such answers? How should Research Ethics be handled? Could we warn people about care? How would people with certain last names considered at risk react? And the other way around? From a sociological point of view, could such a recommendation divide society into "superior" or "inferior" genes?
What do you think about it?
=================================
Note: Due to important personal matters I took a break and returned to my activities today, February 13, 2023. I am very happy to come across so much interesting feedback.
Relevant answer
Answer
It is just coincidental
  • asked a question related to Data Mining
Question
3 answers
Dear all,
Why is forward selection search so popular and widely used in feature selection (FS) based on mutual information, such as MRMR, JMI, CMIM, and JMIM (See )? Why are other search approaches, such as beam search, not used? If there is a reason for that, kindly let me know.
Relevant answer
Answer
There are three main types of feature selection: filter methods, wrapper methods, and embedded methods. Filter methods use criteria that are independent of the modeling process, such as mutual information, correlation, or the chi-square test, to score each feature (or a subset of features) against the target. Other filter methods include variance thresholding and ANOVA. Wrapper methods iteratively train models on subsets of features and use the error rates to select the critical features. Subsets can be chosen by sequential forward selection, sequential backward selection, bidirectional selection, or at random. Because they train a model for each candidate subset, wrapper methods are more computationally expensive than filter methods. There are also heuristic approaches, such as branch and bound search, that are non-exhaustive. In some cases filter methods are applied before wrapper methods. Embedded methods include using decision trees or random forests to extract feature importances and decide which features to select. Overall, forward, backward, and bidirectional methods are stepwise methods for searching for crucial features. Beam search is more of a graph-based heuristic optimization method, similar to best-first search, and is seen in neural network or tree optimization rather than being applied directly as a feature selection method.
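To show what forward selection with a mutual-information criterion looks like in practice, here is a minimal sketch for discrete features. It uses the "relevance minus mean redundancy" score in the spirit of mRMR; it is a simplified illustration, not a reference implementation of any of the named methods, and the toy data is made up.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

def mrmr_forward(features, target, k):
    """Greedy forward selection: maximise relevance minus mean redundancy."""
    selected, remaining = [], set(features)
    relevance = {name: mutual_information(col, target) for name, col in features.items()}
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy
        best = max(sorted(remaining), key=score)  # sorted() makes ties deterministic
        selected.append(best)
        remaining.remove(best)
    return selected

target = [0, 0, 0, 1, 0, 0, 0, 1]        # equals (a AND b)
features = {
    "a":      [0, 0, 1, 1, 0, 0, 1, 1],
    "a_copy": [0, 0, 1, 1, 0, 0, 1, 1],  # fully redundant with "a"
    "b":      [0, 1, 0, 1, 0, 1, 0, 1],  # independent of "a"
}
print(mrmr_forward(features, target, k=2))
```

Each greedy step scores every remaining feature once, so the cost grows roughly with the number of features times k. A beam search of width w would keep w partial subsets alive and pay about w times that cost, which may be part of why the cheaper greedy search is the common default.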
  • asked a question related to Data Mining
Question
4 answers
Data mining has a broad discussion of how to manipulate data mining on other algorithms.
Relevant answer
Answer
Data mining is the process used to analyze data for relationships that have not previously been discovered, typically within existing large databases that hold massive amounts of data.
Moreover, there are four key properties of data mining:
I. Automatic Discovery of patterns
II. Prediction of likely outcomes
III. Creation of actionable information
IV. Focus on large data sets and databases
  • asked a question related to Data Mining
Question
3 answers
How can I distinctively differentiate between 'data mining', 'data analysis', and 'data analytics'?
Is there any example to add, towards proper understanding of the differences?
Thank you!
Relevant answer
Answer
Differences between data analytics and data mining (ironhack.com)
  • asked a question related to Data Mining
Question
8 answers
One of my master students is currently conducting a preliminary study to find out the maturity of the Cross Industry Standard Process for Big Data (CRISP4BigData) for use in Big Data projects. I would like to invite all scientists, Big Data experts, project managers, data engineers, data scientists from my network to participate in the following survey. Feel free to share!
Relevant answer
Answer
Done
  • asked a question related to Data Mining
Question
6 answers
I'm an undergraduate doing a Software Engineering degree. I'm looking for a research topic for my final year project. If anyone has any ideas or research topics or any advice on how or where to find one please post them.
Thanks in advance ✌
Relevant answer
Answer
Most of the SE based on Design and cost functions. Concentrate on
  • asked a question related to Data Mining
Question
1 answer
Is there an updated list of ?
Relevant answer
Answer
I'm not sure what you mean by 'approved' databases. Approved by what or by whom?
  • asked a question related to Data Mining
Question
2 answers
Hi, could you please guide me on how to conduct Latent Semantic Analysis through text mining for my business research? Any website, book, or tutorial videos would help me apply this method to my research project. Thanks in advance. Kind regards, Bushra Aziz
Relevant answer
Answer
MATLAB's Text Analytics Toolbox may be suitable for your task. In practice, it is friendlier to beginners than the Python tools. The official website and its help centre provide step-by-step tutorial materials, and you can also find some videos about it on YouTube.
  • asked a question related to Data Mining
Question
4 answers
Hi,
Thank you for help.
How can I make the scheduling process in CloudSim an environment for my reinforcement learning model?
Relevant answer
Answer
Thank you for sharing the links and papers, I will use them to learn.
I appreciate your time and efforts
Best Regards,
Bashar
  • asked a question related to Data Mining
Question
9 answers
I am looking for a justification for associating data mining with big data analytics. Many studies note that, beyond the characteristics of the data, there is a line of thought that treats this as a question of taxonomy: data mining is a step within big data analytics. Can I think of it this way, or is there something I'm not considering?
Relevant answer
Answer
Please refer to the literature relevant to your study.
  • asked a question related to Data Mining
Question
7 answers
Modern politics is characterized by many aspects which were not associated with traditional politics. Big data is one of them. Data mining is being done by political parties as they seek help from data scientists to arrive at various patterns to identify behavior of voters. Question is, what are the various ways in which big data is being used by modern political parties and leaders?
Relevant answer
Answer
Big Data platforms allow government agencies to access large volumes of information that are essential for their daily operations. With real-time access, governments can identify areas that require attention, make better and more timely judgments about how to proceed, and enact the necessary changes.
  • asked a question related to Data Mining
Question
3 answers
I would like some suggestions and need a health insurance dataset on which text mining is possible. Any recent papers that address such a dataset would be helpful.
Relevant answer
Answer
Dear Anuradha,
Please check the following link:
  • asked a question related to Data Mining
Question
7 answers
I have a data set containing a text field for more than 3,000 records, each holding notes from a doctor. I need to extract specific information from all of them, for example the doctor's final decision and the classification of the patient. What is the most appropriate way to analyze these texts? Should I use information retrieval or information extraction, or would a question-answering system be fine?
Relevant answer
Answer
Dear Matiam Essa,
Information extraction is a text mining technique that focuses on identifying and extracting entities, attributes, and their relationships from semi-structured or unstructured text. The extracted information is then stored in a database for future access and retrieval. Well-known text mining techniques are:
Information Extraction (IE)
Information Retrieval (IR)
Natural Language Processing
Clustering
Categorization
Visualization
With the increasing amount of text data, effective techniques are needed to examine the data and extract relevant information from it. Various text mining techniques are used to extract interesting information efficiently from multiple sources of textual data, and they are continually refined to improve the text mining process.
Good luck
  • asked a question related to Data Mining
Question
7 answers
Which tools, other than Python-based ones, solve prediction problems effectively?
Relevant answer
Answer
We can't say outright which tool is the best one, since it depends on the data type and on the person using it. Here you can find some of them:
I prefer Orange Data Mining. It is a free, open-source data visualization, machine learning, and data mining toolkit.
  • asked a question related to Data Mining
Question
3 answers
Problem statement: Google Trend Analysis and Paradigm Shift of Online Education
Platforms during the COVID-19 Pandemic
I would like to know which methodologies, data preprocessing techniques, data mining methods, and metrics are used for this analysis.
Relevant answer
Good morning
I invite you to consult the SCOPUS and Web of Science databases.
Best regards
Ph.D., MBA Ingrid del Valle García Carreno
  • asked a question related to Data Mining
Question
4 answers
How do data mining researchers test or evaluate the EFFICIENCY of their data mining models?
Or do they use an ISO certification evaluation?
The model is the output of a hypothesis and theory that I want to test, so I would rather not have other people evaluate the model as they would a system.
Data mining evaluation metrics alone cannot be used to support the study.
I am looking for studies or methods that I can use to back up the efficacy of the model.
Feel free to educate me. I would love to hear your thoughts.
Relevant answer
Answer
  • asked a question related to Data Mining
Question
13 answers
Hi everybody,
I would like to do part of speech tagging in an unsupervised manner, what are the potential solutions?
Relevant answer
Answer
  • asked a question related to Data Mining
Question
3 answers
Please suggest R packages and code for text mining (or any other programming language) to search the PubMed database.
Relevant answer
Answer
Ajit Kumar Singh Enter a free text search into the PubReMiner tool, and it will search PubMed for results. The program analyzes these data and generates tables that rank the frequency of terms in the articles' titles and abstracts, as well as related MeSH categories.
  • asked a question related to Data Mining
Question
11 answers
Data Mining (DM) is a process of extracting and discovering patterns in large data sets including methods of Machine Learning (including Deep Learning and Statistical Learning), Statistics, and Database Systems.
Machine Learning (ML) is the study of computer algorithms that improve automatically through experience and by the use of data.
It would seem very simplistic to consider the ML only as a part of the larger field of the DM.
From a very rough and general point of view, DM and ML are part of the mathematics.
From another point of view, more precise but more obsolete, they are both seen as a part of Artificial Intelligence.
I would like to propose to consider both disciplines as overlapping for most of their methods.
Do you have at least 3 differences between DM and ML to report?
Relevant answer
Answer
I think machine learning facilitate data mining. As such, we may say that ML algorithms are just tools for data mining.
  • asked a question related to Data Mining
Question
2 answers
Dear Madam, Please advise about post Doc supervisors in the university in the field of educational data mining and learning analytics for strengthening university decision making. I will be grateful
  • asked a question related to Data Mining
Question
11 answers
How can I measure classification errors using Weka? Can we use the value of RMSE, or a similar measure, to derive the classification rate?
Relevant answer
Answer
I have been teaching myself how to use RWeka, specifically so that I can implement the M5P model. I have been able to apply it to my data, but I do not understand what the percentage represents. For example, the beginning of the sample output from RWeka's manual is:
M5 pruned model tree: (using smoothed linear models) CHMIN <= 7.5 : LM1 (165/12.903%)
The other LMs have other "scores" like this, like (6/18.551%) and (23/48.302%). What exactly do these percentages and numbers represent?
  • asked a question related to Data Mining
Question
13 answers
I'm searching about autoencoders and their application in machine learning issues. But I have a fundamental question.
As we all know, there are various types of autoencoders, such as ​Stack Autoencoder, Sparse Autoencoder, Denoising Autoencoder, Adversarial Autoencoder, Convolutional Autoencoder, Semi- Autoencoder, Dual Autoencoder, Contractive Autoencoder, and others that are better versions of what we had before. Autoencoder is also known to be used in Graph Networks (GN), Recommender Systems(RS), Natural Language Processing (NLP), and Machine Vision (CV). This is my main concern:
Because the input and structure of each of these machine learning problems are different, which version of Autoencoder is appropriate for which machine learning problem.
Relevant answer
Answer
Look the link, maybe useful.
Regards,
Shafagat
  • asked a question related to Data Mining
Question
1 answer
I am very new to these forecasting methods. Can someone help me with how to forecast the next period using these methods?
I have weekly demand data that I have classified into lumpy, erratic, and smooth demand. Since Croston's forecasting method is best suited to smooth demand and the SBA method to lumpy demand, I need to know how to apply them to plan the demand for the next weekly period.
Is there any other method to forecast lumpy and smooth demands other than this method?
Thank you in advance
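For readers who want to see the mechanics, here is a minimal sketch of Croston's method and its SBA (Syntetos-Boylan Approximation) variant in plain Python. The function name `croston`, the smoothing constant `alpha=0.1`, and the way the first nonzero demand initializes the estimates are assumptions made for illustration; published implementations differ in these details.

```python
def croston(demand, alpha=0.1, sba=False):
    """One-step-ahead demand-per-period forecast (illustrative sketch).

    demand: list of non-negative demands, one per period.
    alpha:  smoothing constant (assumed value, tune for your data).
    sba:    if True, apply the Syntetos-Boylan bias correction.
    """
    size = None      # smoothed size of nonzero demands
    interval = None  # smoothed number of periods between nonzero demands
    gap = 1          # periods elapsed since the last nonzero demand

    for d in demand:
        if d > 0:
            if size is None:
                # Assumption: initialize with the first nonzero observation.
                size, interval = float(d), float(gap)
            else:
                size += alpha * (d - size)
                interval += alpha * (gap - interval)
            gap = 1
        else:
            gap += 1

    if size is None:
        return 0.0  # no demand observed at all

    forecast = size / interval
    if sba:
        forecast *= 1 - alpha / 2  # SBA correction factor
    return forecast


weekly = [0, 0, 5, 0, 0, 5]
print(croston(weekly))            # Croston forecast for next week
print(croston(weekly, sba=True))  # SBA forecast for next week
```

The forecast is an average demand rate per period, so the same value is used for every future week until new data arrives.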
  • asked a question related to Data Mining
Question
4 answers
I am looking for a free-of-charge international conference on metaheuristic algorithms or data mining. Is there anyone who can help me?
Relevant answer
Sikirat Aina thanks.
  • asked a question related to Data Mining
Question
5 answers
I have past 4 years of weekly demand data. There are various products with their demand values. I am trying to calculate the future weekly values for a year. The data doesn't follow any trend and it is random. There are many weeks with Zero demands too.
I am very new to time series analysis. Can someone help me in suggesting an appropriate method?
Relevant answer
Answer
First of all, the fundamental assumption of any forecasting technique (implicit or explicit) is that the time series represents a stable pattern that can be identified and then extended into the future. If the pattern of past data points is not statistically stable, or is random (as in your case), then no meaningful forecast is possible regardless of the sophistication of the forecasting technique.
Because your time series data points are random with no trend, your best bet is to generate other random points from your current data distribution and treat these new random points as your forecast. You could build a histogram of your existing data points and generate new random points from this histogram.
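The histogram idea above can be sketched in a few lines of Python. This is only an illustration: the function name `histogram_forecast`, the example numbers, and the choice of one histogram bin per distinct demand value are assumptions, not part of the original answer.

```python
import random
from collections import Counter


def histogram_forecast(history, horizon, seed=None):
    """Draw `horizon` future values from the empirical histogram of `history`."""
    counts = Counter(history)              # histogram: value -> frequency
    values = list(counts)
    weights = [counts[v] for v in values]  # sample in proportion to frequency
    rng = random.Random(seed)              # seed only for reproducibility
    return rng.choices(values, weights=weights, k=horizon)


# Hypothetical weekly demand with many zero weeks.
past_weeks = [0, 0, 3, 0, 7, 0, 2, 0, 0, 3]
next_year = histogram_forecast(past_weeks, horizon=52, seed=1)
print(next_year[:10])
```

Each run with a different seed gives a different path, so in practice it may be more informative to generate many paths and look at their spread than to rely on a single one.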
  • asked a question related to Data Mining
Question
7 answers
I have some Key Informant Interview (KII) data. I want to apply Natural Language Processing (NLP) to identify patterns in the data. Can NLP applied to KII analysis be described as a data analytics tool in the report/paper? TIA
Relevant answer
Answer
Of course, it is interesting work. For example: (1) use NER (Named Entity Recognition) and RE (Relation Extraction) to construct a knowledge graph, then analyze the relations between the interviewees or the knowledge of a single interviewee; (2) use EE (Event Extraction) to identify the event correlation between the questions and answers; (3) use SA (Sentiment Analysis) to analyze attitudes toward the interviewer or the company; (4) use topic models to analyze the topics of the interview and find out which topic the interviewers are most interested in.
There are many interesting things you can do with NLP analysis. I hope you finish an interesting paper soon.
  • asked a question related to Data Mining
Question
4 answers
I have compiled a set of lecture notes, examples, and notes from Data Mining: Concepts and Techniques (The Morgan Kaufmann Series in Data Management Systems). The attached PDF is the first iteration of the text; at this point it is just a manuscript. I would appreciate feedback on how to organize and structure the text so that it could be presented to a publisher.
Relevant answer
Answer
Przemysław Dolata Thank you for your advice. There are many PhD programs available. I am currently applying to several. I would love to get advice on your strategy for reading and analyzing texts.
  • asked a question related to Data Mining
Question
9 answers
What will be the future applications of analytics of large data sets conducted in the computing cloud on computerized Business Intelligence analytical platforms in Big Data database systems in enterprise logistics management?
The analytics conducted on computerized Business Intelligence platforms is one of the key advanced information technology technologies of the fourth technological revolution, known as Industry 4.0. The current technological revolution described as Industry 4.0 is determined by the development of the following technologies of advanced information processing: Big Data database technologies, cloud computing, machine learning, Internet of Things, artificial intelligence, Business Intelligence and other advanced data mining technologies.
The analytics conducted on computerized Business Intelligence platforms currently supports business management processes, including logistics management.
In my opinion, the use of analytics of large data sets conducted in the computing cloud on computerized Business Intelligence analytical platforms in Big Data database systems in enterprise logistics management, including supply logistics, production logistics, provision of services and distribution of manufactured products and services, is currently growing.
The analytics conducted on large data sets conducted in the cloud computing on Business Intelligence computerized platforms in Big Data database systems makes it particularly easy to identify opportunities and threats to business development, allows for quick generation of analytical reports on selected issues in the economic and financial situation of the business entity. In this way, the generated reports can be helpful in the processes of enterprise logistics management, including supply logistics, production logistics, provision of services and distribution of manufactured products and services.
Do you agree with my opinion on this matter?
In view of the above, I am asking you the following question:
What will be the future applications of analytics of large data sets conducted in the computing cloud on computerized Business Intelligence analytical platforms in Big Data database systems in enterprise logistics management?
Please reply
I invite you to the discussion
The issues of the use of information contained in Big Data database systems for the purposes of conducting Business Intelligence analyzes are described in the publications:
I invite you to discussion and cooperation.
Best wishes
Relevant answer
Answer
It is a rising field, since intelligence, and artificial intelligence in general, is becoming the dominant technology of the current era.
  • asked a question related to Data Mining
Question
12 answers
I am doing a project on automated classification of software requirements using NLP and a machine learning approach, i.e. Naive Bayes. For this I need a dataset of classified software requirements. I have searched the PROMISE data repository but did not find a dataset that meets my needs. It would be highly appreciated if someone could tell me where I can find and download such a dataset.
Relevant answer
Answer
The PROMISE dataset is here: https://doi.org/10.5281/zenodo.268542
The PURE dataset is here: https://doi.org/10.5281/zenodo.1414117
  • asked a question related to Data Mining
Question
6 answers
Dear community, I need your help with extracting data from the Binance platform in order to use it for a forecasting problem. For example, we extract data about a certain cryptocurrency, clean it and make it ready for use, then forecast whether we should buy it or not, with an alarm when the timing is right, using Python, machine learning, and statistics.
  • asked a question related to Data Mining
Question
23 answers
Hi everyone
I'm looking for a quick and reliable way to estimate my missing climatological data. My data is daily and covers more than 40 years. It includes minimum and maximum temperature, precipitation, sunshine hours, relative humidity, and wind speed. My main problem is the sunshine hours data, which has a lot of gaps. These gaps are scattered through the time series; sometimes they span several months and even a few years. I work with 18 stations. Because my data is daily, the number of missing values is high, so I need to estimate the missing data before starting work. Your comments and experiences would be very helpful.
Thank you so much for advising me.
Relevant answer
Answer
It is in French
  • asked a question related to Data Mining
Question
14 answers
Cluster analysis, classification, Data Mining
Relevant answer
Answer
Grouping related data according to categories or themes. These are based on inter-relations between the variables which can influence each other in the respective setting.
  • asked a question related to Data Mining
Question
3 answers
Hi Fellows,
The matrix is here at the bottom: https://statweb.stanford.edu/~jtaylo/courses/stats202/visualization.html. A similar version appears in the book Introduction to Data Mining. It's clear that colours toward the red end indicate stronger correlation, but which attributes or variables are actually shown as correlated? For example, along the main diagonal, cases of the same species show mostly perfect correlation, with a few near-perfect occurrences. Normally, a correlation is calculated between two columns of values, not between two single cases.
Thanks
RP
  • asked a question related to Data Mining
Question
4 answers
I am working on a data mining project and would like to portray the correlation between healthcare expenditure by country and the population's life expectancy/general health and am having trouble finding sizeable data sets.
Relevant answer
Answer
Healthcare expenditures: http://wdi.worldbank.org/table/2.12
Here's the full list of indicators: http://wdi.worldbank.org/table
  • asked a question related to Data Mining
Question
4 answers
Hi
How can new data mining methods be used to assess the ecological potential of the land?
Relevant answer
Answer
By using machine learning algorithms.
  • asked a question related to Data Mining
Question
10 answers
I am passionate about working on medical data, but unfortunately I couldn't find data in my home country for the disease I want to work on. Is anyone from medical informatics or health data mining able to collaborate with me?
Relevant answer
Answer
Please have a look at our (Eminent Biosciences (EMBS)) collaborations and let me know if you are interested in associating with us.
Our recent publications in collaboration with industry and academia in India and worldwide:
EMBS publication In association with Universidad Tecnológica Metropolitana, Santiago, Chile. Publication Link: https://pubmed.ncbi.nlm.nih.gov/33397265/
EMBS publication In association with Moscow State University , Russia. Publication Link: https://pubmed.ncbi.nlm.nih.gov/32967475/
EMBS publication In association with Icahn Institute of Genomics and Multiscale Biology,, Mount Sinai Health System, Manhattan, NY, USA. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/29199918
EMBS publication In association with University of Missouri, St. Louis, MO, USA. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30457050
EMBS publication In association with Virginia Commonwealth University, Richmond, Virginia, USA. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27852211
EMBS publication In association with ICMR- NIN(National Institute of Nutrition), Hyderabad Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/23030611
EMBS publication In association with University of Minnesota Duluth, Duluth MN 55811 USA. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27852211
EMBS publication In association with University of Yaounde I, PO Box 812, Yaoundé, Cameroon. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30950335
EMBS publication In association with Federal University of Paraíba, João Pessoa, PB, Brazil. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30693065
Eminent Biosciences(EMBS) and University of Yaoundé I, Yaoundé, Cameroon. Publication Link: https://pubmed.ncbi.nlm.nih.gov/31210847/
Eminent Biosciences(EMBS) and University of the Basque Country UPV/EHU, 48080, Leioa, Spain. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27852204
Eminent Biosciences(EMBS) and King Saud University, Riyadh, Saudi Arabia. Publication Link: http://www.eurekaselect.com/135585
Eminent Biosciences(EMBS) and NIPER , Hyderabad, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/29053759
Eminent Biosciences(EMBS) and Alagappa University, Tamil Nadu, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30950335
Eminent Biosciences(EMBS) and Jawaharlal Nehru Technological University, Hyderabad , India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/28472910
Eminent Biosciences(EMBS) and C.S.I.R – CRISAT, Karaikudi, Tamil Nadu, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30237676
Eminent Biosciences(EMBS) and Karpagam academy of higher education, Eachinary, Coimbatore , Tamil Nadu, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30237672
Eminent Biosciences(EMBS) and Ballets Olaeta Kalea, 4, 48014 Bilbao, Bizkaia, Spain. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/29199918
Eminent Biosciences(EMBS) and Hospital for Genetic Diseases, Osmania University, Hyderabad - 500 016, Telangana, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/28472910
Eminent Biosciences(EMBS) and School of Ocean Science and Technology, Kerala University of Fisheries and Ocean Studies, Panangad-682 506, Cochin, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27964704
Eminent Biosciences(EMBS) and CODEWEL Nireekshana-ACET, Hyderabad, Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/26770024
Eminent Biosciences(EMBS) and Bharathiyar University, Coimbatore-641046, Tamilnadu, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27919211
Eminent Biosciences(EMBS) and LPU University, Phagwara, Punjab, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/31030499
Eminent Biosciences(EMBS) and Department of Bioinformatics, Kerala University, Kerala. Publication Link: http://www.eurekaselect.com/135585
Eminent Biosciences(EMBS) and Gandhi Medical College and Osmania Medical College, Hyderabad 500 038, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27450915
Eminent Biosciences(EMBS) and National College (Affiliated to Bharathidasan University), Tiruchirapalli, 620 001 Tamil Nadu, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27266485
Eminent Biosciences(EMBS) and University of Calicut - 673635, Kerala, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/23030611
Eminent Biosciences(EMBS) and NIPER, Hyderabad, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/29053759
Eminent Biosciences(EMBS) and King George's Medical University, (Erstwhile C.S.M. Medical University), Lucknow-226 003, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/25579575
Eminent Biosciences(EMBS) and School of Chemical & Biotechnology, SASTRA University, Thanjavur, India Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/25579569
Eminent Biosciences(EMBS) and Safi center for scientific research, Malappuram, Kerala, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30237672
Eminent Biosciences(EMBS) and Dept of Genetics, Osmania University, Hyderabad Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/25248957
EMBS publication In association with Institute of Genetics and Hospital for Genetic Diseases, Osmania University, Hyderabad Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/26229292
Sincerely,
Dr. Anuraj Nayarisseri
Principal Scientist & Director,
Eminent Biosciences.
Mob :+91 97522 95342
  • asked a question related to Data Mining
Question
11 answers
Hi There!
My data has a number of features (which contain continuous data) and a response feature (class label) of categorical data (binary). My intention is to study how the response feature (class) varies with all the other features, using a variety of feature selection techniques. Kindly help by pointing out the right techniques for the purpose. The data looks like this:
------------------------------------------------------------------
f1 f2 f3 f4 ... fn class
------------------------------------------------------------------
0.2 0.3 0.87 0.6 ... 0.7 0
0.2 0.3 0.87 0.6 ... 0.7 1
0.2 0.3 0.87 0.6 ... 0.7 0
0.2 0.3 0.87 0.6 ... 0.7 1
-------------------------------------------------------------------
Relevant answer
Answer
You can select the best algorithm, based on a measure of performance, from a number of data mining algorithms. An exhaustive list may be found at: https://www.kdnuggets.com/2015/05/top-10-data-mining-algorithms-explained.html
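For the setting in the question (continuous features, binary class), one simple filter-style option is to rank features by the absolute point-biserial correlation with the class, which is just the Pearson correlation between a continuous column and a 0/1 label. The sketch below is an illustration under that assumption; the function names and the toy data are hypothetical, and a filter score like this ignores interactions between features.

```python
from math import sqrt


def point_biserial(feature, labels):
    """Pearson correlation between a continuous feature and 0/1 labels."""
    n = len(feature)
    mean_x = sum(feature) / n
    mean_y = sum(labels) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(feature, labels))
    var_x = sum((x - mean_x) ** 2 for x in feature)
    var_y = sum((y - mean_y) ** 2 for y in labels)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information
    return cov / sqrt(var_x * var_y)


def rank_features(rows, labels):
    """Return (feature_index, score) pairs, strongest association first."""
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        scores.append((j, abs(point_biserial(column, labels))))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy data: f1 separates the classes, f2 does not.
rows = [[0.1, 0.5], [0.2, 0.4], [0.8, 0.5], [0.9, 0.4]]
labels = [0, 0, 1, 1]
for index, score in rank_features(rows, labels):
    print(f"f{index + 1}: {score:.3f}")
```

Wrapper or embedded methods (for example, selecting features by cross-validated classifier performance) are alternatives when features are expected to interact.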
  • asked a question related to Data Mining
Question
4 answers
I think that Generative Adversarial Networks can be used as Data Farming Means. What do you know about such an approach? Can you give another example of means for Data Farming?
Relevant answer
Answer
Other approaches exist. For instance (and mostly application based)
Extreme Data Mining.
A strategy to apply machine learning to small datasets in materials science.
Machine learning on small size samples: A synthetic knowledge synthesis.
  • asked a question related to Data Mining
Question
4 answers
Why does Particle Swarm Optimization work better for this classification problem?
Can anyone give me any strong reasons behind it?
Thanks in advance.
Relevant answer
Answer
Arash Mazidi PSO is also used in various classification problems. I particularly use it for phishing website datasets.
  • asked a question related to Data Mining
Question
4 answers
How many respondents are really enough?
There are two schools of thought about sample size. One holds that a relatively small sample size is adequate. Perhaps 300-500 respondents can work?
Relevant answer
Answer
What is the best number of respondents when conducting a research?
There are two schools of thought about sample size. One is that as long as a survey is representative, a relatively small sample size is adequate; perhaps 300-500 respondents can work. The other point of view is that while maintaining a representative sample is essential, the more respondents you have the better.
Regards,
Shafagat
  • asked a question related to Data Mining
Question
7 answers
Please share papers that shed light on text mining and meta-analysis.
Relevant answer
Answer
Here's a link of my meta-analysis paper
  • asked a question related to Data Mining
Question
9 answers
Let us consider a sales invoice like this:
Gender | Age | Street | Item 1 | Count 1 | Item 2 | Count 2 | ... | Item N | Count N | Total Price (Label)
Male | 22 | S1 | Milk | 2 | Bread | 5 | ... | - | - | 10 $
Female | 10 | S2 | Coffee | 1 | - | - | ... | - | - | 1 $
....
We want to predict the total price of an invoice based on the buyer's demographic information (such as gender, age, and job) and on the purchased items and their counts. Note that we assume each item's price is unknown and that prices change over time (so the dataset will also include a date).
The main question is how to use this dataset, which contains transactional data (items) whose order is not important. For example, a purchase of item1 and item2 is equivalent to a purchase of item2 and item1, so the order in which items appear in the item columns should make no difference.
This dataset contains both multivariate and transactional data. My question is: how can we predict the label more accurately?
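One common way to make the item columns order-independent is to turn each invoice into a fixed-length count vector with one position per known item, so that the same basket always maps to the same vector. The sketch below illustrates this; the function name `basket_to_vector` and the small vocabulary are assumptions for the example.

```python
def basket_to_vector(basket, vocabulary):
    """Encode (item, count) pairs as counts over a fixed item vocabulary."""
    index = {item: i for i, item in enumerate(vocabulary)}
    vector = [0] * len(vocabulary)
    for item, count in basket:
        vector[index[item]] += count  # position depends on the item, not on its column
    return vector


vocabulary = ["Bread", "Coffee", "Milk"]  # hypothetical list of all items sold
a = basket_to_vector([("Milk", 2), ("Bread", 5)], vocabulary)
b = basket_to_vector([("Bread", 5), ("Milk", 2)], vocabulary)
print(a)       # [5, 0, 2]
print(a == b)  # True
```

The resulting vector can be concatenated with the demographic columns and the date, then passed to any regression model. Because the total is roughly a sum of count times price, a linear model on these counts would effectively estimate per-item prices.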
Relevant answer
Answer
Hi Dr Behzad Soleimani Neysiani . I agree with Dr Qamar Ul Islam .
  • asked a question related to Data Mining
Question
4 answers
For example, k-nearest neighbor needs to find the smallest of the distances between a query and a large number of data points.
In contrast, k-means clustering finds the smallest of the distances between each data point and a few cluster centers.
Which other techniques, like k-nearest neighbor, require computing the maximum or minimum over a large number of data points?
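To make the contrast concrete, here is a small sketch of the two minimum searches in Python. The function names are hypothetical; the point is only that the k-NN search scans every stored point per query, while the k-means assignment scans only the few centers.

```python
import heapq
from math import dist


def k_nearest(query, points, k):
    """Brute-force k-NN: pick the k smallest distances over all stored points."""
    return heapq.nsmallest(k, points, key=lambda p: dist(query, p))


def nearest_center(point, centers):
    """k-means assignment step: pick the closest of a few cluster centers."""
    return min(centers, key=lambda c: dist(point, c))


points = [(1, 1), (0, 1), (5, 5)]
print(k_nearest((0, 0), points, k=2))
print(nearest_center((4, 4), [(0, 0), (5, 5)]))
```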
Relevant answer
Answer
I recommend reading the following paper as it contains useful information to answer your question:
  • asked a question related to Data Mining
Question
3 answers
I want to understand the C5.0 algorithm for data classification. Does anyone have the steps for it, or the original paper in which this algorithm was presented?
Relevant answer
  • asked a question related to Data Mining
Question
6 answers
What is the best algorithm to complement a cluster analysis (k-means) and define the ideal cluster number? I am testing the Weka data mining application, which incorporates clustering algorithms that do not require prior selection of the number of clusters. Has anyone tried it?
  • asked a question related to Data Mining
Question
12 answers
Hello everybody
I am solving a Social Network Analysis problem. I have 9 centrality measures in my problem and I am trying to combine them for creating a new centrality measure.
I have chosen TOPSIS as a combining method. Now I am looking for an easy method to assign appropriate weights to my criteria.
If you think you can help me and even introduce me to a better solution than TOPSIS, I will be glad if you share it with me.
Best Regards
Relevant answer
Answer
I suggest using entropy derived weights that are objective
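A minimal sketch of the entropy weight method is given below, assuming a decision matrix with one row per node and one column per centrality measure, all values non-negative, at least two rows, and every column sum greater than zero. The function name `entropy_weights` and the example matrix are illustrative.

```python
from math import log


def entropy_weights(matrix):
    """Objective criteria weights from the Shannon entropy of each column."""
    m = len(matrix)     # number of alternatives (nodes), assumed >= 2
    n = len(matrix[0])  # number of criteria (centrality measures)
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)  # assumed > 0
        entropy = 0.0
        for x in column:
            p = x / total
            if p > 0:        # treat 0 * log(0) as 0
                entropy -= p * log(p)
        entropy /= log(m)    # scale entropy to the range [0, 1]
        divergences.append(1.0 - entropy)
    total_divergence = sum(divergences)
    if total_divergence == 0:
        return [1.0 / n] * n  # every criterion is uniform: fall back to equal weights
    return [d / total_divergence for d in divergences]


# Two nodes, two measures: the first measure does not discriminate at all.
print(entropy_weights([[1, 1], [1, 3]]))
```

A criterion whose values are nearly identical across nodes has entropy close to 1 and therefore receives a weight close to 0, while criteria that separate the nodes more sharply receive larger weights. The resulting list can be used as the weight vector in TOPSIS.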
  • asked a question related to Data Mining
Question
5 answers
I have seen City Pulse (see link) and they have the type of data I'm looking for, but not in large enough quantity. In the best case, the data will have recording intervals that are < 1 hour (the more frequent, the better) and have total duration of at least a month. 
Relevant answer
Answer
i need a monthly dataset for water usage...
  • asked a question related to Data Mining
Question
4 answers
I would like to carry out a study (Social-Economical Categorization) on multi datasets (text data from ISPs, hospitals, Government records agencies ) using any suitable data mining technique. I read that WEKA can do the job. I am still a newbie when it comes to data mining analysis and WEKA. Kindly advise on how best I can do this.
  • asked a question related to Data Mining
Question
4 answers
What procedure and data should I use?
How should I structure the empirical study?
Relevant answer
Answer
You may find this paper useful:
Stagnaro, M. N., Arechar, A. A., & Rand, D. G. (2017). From good institutions to generous citizens: Top-down incentives to cooperate promote subsequent prosociality but not norm enforcement. Cognition, 167, 212–254.
  • asked a question related to Data Mining
Question
3 answers
I usually use Latent Dirichlet Allocation to cluster texts. What do you use? Can someone give a comparison between different text clustering algorithms?
Relevant answer
Answer
I typically have used k-means clustering algorithm which is very popular. This algorithm is based on partitioning. Similarly you can use clustering algorithms based on density or hierarchical clustering methods.
  • asked a question related to Data Mining
Question
4 answers
Can you suggest any topic related to Big Data + Data Mining + Association Rule Mining + Predicting Consumer Behaviors
Relevant answer
Answer
I found several hits when I type this into search engines, but likely you found not all the keywords were found simultaneously?
I could suggest one article to consider if only for your literature review as it covers a lot,
Strang, K. D., & Sun, Z. (2017). Scholarly big data body of knowledge: What is the status of privacy and security? Annals of Data Science, 4(1), 1-17. http://dx.doi.org/10.1007/s40745-40016-40096-40746.
  • asked a question related to Data Mining
Question
9 answers
What are the various query-based (Top-K Frequent Pattern Mining) techniques being used for different purposes? I would also like to know about new research trends in Data Mining.
Relevant answer
Answer
Causal inference will be the next frontier (4-10 years ) in AI, machine learning and modeling.
  • asked a question related to Data Mining
Question
8 answers
I'm looking for finding frequent itemsets in sequences, which means the order of appearance of items matters in itemsets. Consider the following example :
1,2,3
1,3,2
3,1,2
Assume that the order of items matters, then if we put min_support = 3, {1,2} is frequent, because support({1,2})= 3 and every time we see {1,2} in this dataset, 2 comes after 1.
Let's consider {1,3}, we know that this itemset appears 3 times in our dataset, but is not frequent, because only in 2 transactions 3 comes after 1.
I'm looking for an algorithm that can do this for me. I found algorithms such as GSP that do something similar, but not exactly what I want. Can you please recommend an algorithm that is able to find such frequent itemsets?
Thanks in advance
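The support definition in the question can be checked with a short brute-force sketch in Python: a pattern is counted in a sequence only if its items appear there in the same order (gaps allowed). The function names are hypothetical, and enumerating all ordered pairs is only practical for small item sets; it is not a substitute for a dedicated sequential pattern mining algorithm.

```python
from itertools import permutations


def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in the same order."""
    remaining = iter(sequence)
    # Each `in` test consumes the iterator up to the match, which enforces order.
    return all(item in remaining for item in pattern)


def ordered_support(pattern, sequences):
    """Number of sequences that contain `pattern` as an ordered subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences)


def frequent_ordered_pairs(sequences, min_support):
    """Brute force: test every ordered pair of distinct items."""
    items = sorted({x for s in sequences for x in s})
    result = {}
    for pair in permutations(items, 2):
        support = ordered_support(pair, sequences)
        if support >= min_support:
            result[pair] = support
    return result


data = [[1, 2, 3], [1, 3, 2], [3, 1, 2]]
print(ordered_support([1, 2], data))                # 3
print(ordered_support([1, 3], data))                # 2
print(frequent_ordered_pairs(data, min_support=3))  # {(1, 2): 3}
```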
Relevant answer
Answer
Philippe Fournier Viger I have read your surveys when i was working on my thesis, it helped me a lot and guided me to complete my research. By the way SPMF is amazing! I have seen it and i worked with it in the period of my research.
I think the problem that i have described is a little bit different than known sequential pattern mining algorithms. That's why i decided to ask it here.
Thanks in advance for your answer.
  • asked a question related to Data Mining
Question
5 answers
Can anyone help me find a tool that allows me to download old tweets from a user's history? I need to study the content of tweets from 2011 by a group of users who used a particular hashtag.
Relevant answer
Answer
Hi Oscar,
You can use Trackmyhashtag to download old Tweets. It is a Twitter analytics tool that can track any hashtags or topic and provides useful detailed analytics in real-time as well as historical.
It can help you to download old Tweets of any user since 2006. You will get complete tweet detail including total tweets, tweet content, contributors detail who contributed to that specific tweet, and other lots of useful metrics.
Here you will find a "Request data" form which you have to fill with your search term and select the dates and submit. After a while, you'll receive your Tweets data and other details with it.
Thank you:)
  • asked a question related to Data Mining
Question
9 answers
Is there a Python or R package for analyzing spreader nodes and community detection in the multilayer network?
Relevant answer
Answer
I worked on this specific topic in my dissertation. I tried many tools, but eventually my research team and I decided to design a simulator for static, dynamic, and multilayer networks using NetLogo multi-agent programming. Below is the link to the simulator; it might be useful in your work.
Note that you just need to configure the simulator based on your needs.
Good Luck
  • asked a question related to Data Mining
Question
3 answers
Hello,
Does anyone know how to extract all twitter images under a specific hashtag using python or R? Any relevant packages?
Thank you,
Ioanna
PS I am not searching how to extract all images uploaded by a user.
Relevant answer
Answer
You can use the "Tweepy library" to extract Twitter data.
To learn how to use Tweepy library in python, check out the following link:
  • asked a question related to Data Mining
Question
3 answers
I'd like to use it in a classification task.
Relevant answer
  • asked a question related to Data Mining
Question
7 answers
I would like to dive into the research domain of explainable AI. What are some of the recent trending methodologies in this domain? What can be a good start to dive into this field?
Relevant answer
Answer
These Papers will help you:
1: Visual Analytics in Deep Learning: An Interrogative Survey for the Next Frontiers: https://arxiv.org/pdf/1801.06889.pdf
2: Visual Analytics for Explainable Deep Learning:
3: CNN EXPLAINER: Learning Convolutional Neural Networks with Interactive Visualization :