Conference Guides for [IEEE-CIS, INNS] conferences : Although this guide was initially set up for IJCNN2019, and was used for IEEE WCCI 2020, hopefully some of the basic information may still help [organising committee members, authors] up to 2023-2025 (assuming a half-life of links and process information of 3-5 years?). The Authors' and Publications menus have been combined, allowing authors to see the Publications Chair perspective as well. I am no longer keeping this up to date. Bill Howell, 11Dec2020

IEEE CrossCheck - understanding the text-similarity checks


"...   IEEE policy requires that all accepted papers must be checked for text-similarity.
... The IEEE CrossCheck Portal is available to all conference organizers and periodical editors to help screen manuscripts for plagiarized material. The IPR Office partnered with the IEEE Publications Technology department to develop the CrossCheck Portal as a stand-alone web application that can be used by any publications volunteer at any time.
... Conference organizers are encouraged to use CrossCheck as early as possible during the manuscript review process.    ..."

Authors ARE involved in this process! They MUST be aware of the criteria.


WARNINGS and recommendations

29Jan2019 Howell note : We have lost far too many great papers in an entirely avoidable fashion, and upset many authors (as with IJCNN2017). However, it's too late for IJCNN2019 to apply the ideas below. Authors really need to be warned, and to have access to the tools, at least by the time that paper submissions are first opened.

03Mar2019 Howell note : Do NOT use the word "plagiarism"!! That is completely inaccurate as explained below and in the "Authors' Guide : IEEE CrossCheck blog". Not only is this wrong, it is upsetting to authors and they will [probably, justifiably] complain. I now use the expression "[self, external] text similarities".

Also, I strongly encourage making clear the quantitative criteria for paper rejections on the basis of text similarities, including provision of the formula used for screening. I have posted the actual CrossCheck results publicly (coded by paper submission number, no author names), as you can find via the links in the "Authors' Guide : IEEE CrossCheck [author, blog]" web-pages. This was of great assistance to authors, as they could see how their paper ranked compared to others, and they could even run their own assessment to see if their rejection was fair based on the criteria.

Background information

The "Authors' Guide : IEEE CrossCheck Portal - understanding the text-similarity checks" provides the basic background; that material will not be repeated here.

eLearning Course: CrossCheck for Conferences - As part of the MCE Conference Education Program, learn how to access and utilize CrossCheck, a plagiarism detection tool. This eLearning course is available at no cost to IEEE conference organizers and offers an opportunity to earn a certificate for professional development hours.

IEEE CrossCheck Portal Guide (PDF, 3.2 MB) - The IEEE CrossCheck User's Guide provides step-by-step instructions for using the CrossCheck detection service. It also helps users review and interpret CrossCheck Similarity Reports, and explains what actions to take.

General comment - Identification of potential text-similarity, decisions and actions

CrossCheck is a great tool for highlighting papers that warrant further investigation. It is NOT a stand-alone or complete tool for making a decision, and in many cases we have to rely on the reviewers to catch serious cases. For example, even though authors may have copied [theoretical developments, analysis, results] from published papers, if they change the wording and rearrange equations, their "Similarity %" may be quite low. By relying too heavily on CrossCheck, you might simply be weeding out the sloppy transgressors.

Uploading conference papers to CrossCheck

1. From the conference paper system (Tomasz Cholewo), save files in batches of 100-150 papers to an appropriately named zip file (maximum 200 Mbyte zip file size). I didn't succeed in doing this for IJCNN2019, so it was done by the General Co-Chair. Note that you can do this file-by-file, but it takes a long time for a few hundred papers, believe me!
2. Upload the zip file to CrossCheck : As indicated by the "UPLOAD" to the right of the conference publication name on the initial page that you see after logging in. As CrossCheck reminds you, this takes 4-8 hours!
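The batching in step 1 can be sketched in Python. This is only a sketch, assuming the papers sit as PDFs in one directory; the directory layout, file names, and the `batch_NNN.zip` naming are my assumptions, while the 100-150 papers per batch and 200 Mbyte zip limit are as described above.

```python
# Sketch: batch a directory of paper PDFs into zip files for upload to
# the CrossCheck Portal. Batch size and the 200 Mbyte limit follow the
# guide above; directory and file naming are illustrative assumptions.
import os
import zipfile

MAX_ZIP_BYTES = 200 * 1024 * 1024  # CrossCheck limit: 200 Mbyte per zip

def batch_zips(papers_dir, out_dir, batch_size=150):
    """Zip the PDFs in papers_dir into out_dir in batches; return zip paths."""
    pdfs = sorted(f for f in os.listdir(papers_dir) if f.endswith(".pdf"))
    zips = []
    for batch_no, start in enumerate(range(0, len(pdfs), batch_size), 1):
        zip_path = os.path.join(out_dir, f"batch_{batch_no:03d}.zip")
        with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
            for pdf in pdfs[start:start + batch_size]:
                zf.write(os.path.join(papers_dir, pdf), arcname=pdf)
        if os.path.getsize(zip_path) > MAX_ZIP_BYTES:
            print(f"WARNING: {zip_path} exceeds 200 Mbyte - use a smaller batch")
        zips.append(zip_path)
    return zips
```

For a few hundred papers this replaces the painful file-by-file route with two or three zip uploads.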

Analysis of CrossCheck results

Start with the IEEE tutorial and guide (links in the Background section above), which explain how to get into the system, upload [individual, zip file of <= 200 Mbytes (150 papers)] papers, and do a range of tasks in the system.

As a rough estimate, I took 15-30 minutes per paper analysis, after spending ~2/3 hour per batch of 100 papers for the [zip download, upload to CrossCheck] steps; that estimate doesn't include the time to build the zip file for a batch of papers. Both estimates are highly variable, and could improve over time as one becomes familiar with the processes and CrossCheck.

My approach for a paper analysis is, very generally, as follows. This compact list should NOT be taken as a rule, but should be [criticised, corrected, improved]!

CrossCheck rejection criteria

As a rough guide of how this was applied for IJCNN2019 :

Examples of detailed CrossCheck analysis notes for a batch of papers

Here is an example report of the detailed analysis of CrossCheck results for a batch of files, including a summary of recommendations for each paper, grouped into "recommended actions". In this case, only papers with CrossCheck results >= 50% similarity were covered, as the first stage of analysis. The same approach is used for papers with (30% < similarity <= 50%), but given the huge number of such papers, it is only practical to do a sampling, perhaps 5-10 papers in a hundred.

For privacy reasons, [authors' names, paper titles] have been replaced by one of :

<.IJCNN paper authors' names.>
<.IJCNN paper title.>

<.Other paper authors' names.>
<.Other paper title.>

The example from IJCNN2017 is provided in this image - short commented results of the analysis in all targeted cases (right-click to open in a new tab or window) : CrossCheck IJCNN 2017 analysis of close checks

These examples also bring out cases where "plagiarism" isn't really a problem (the [course notes, PhD thesis] examples), and how "self-text-similarity" is actually the major portion of CrossCheck results - noting that excessive self-text-similarity is still reason for rejection. All author names and information that might betray privacy have been removed.

Notification to authors of paper rejection, explanation of CrossCheck results

Notification of rejection on the basis of CrossCheck [results, analysis] involves :

Generate a summary listing of all CrossCheck results

While this is not necessary, it is handy to have an overall summary list of CrossCheck results. This allows easy comparison of summary CrossCheck results and [accept, reject] decisions, as well as quick checks for papers that haven't been submitted to CrossCheck for analysis.

After ALL conference papers have been uploaded to CrossCheck (remember - it takes up to 8 hours for results to be available!) :
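Such a summary listing can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes the CrossCheck results and the conference decisions have each been exported to rows with [paper_id, similarity_pct, decision] fields; those column names are my assumptions, not the portal's actual export format.

```python
# Sketch: merge CrossCheck summary results with [accept, reject]
# decisions, and flag papers never submitted to CrossCheck.
# Field names ("paper_id", "similarity_pct", "decision") are
# assumptions, not the CrossCheck Portal's actual export format.
def summarize(crosscheck_rows, decision_rows):
    """Return one dict per paper: id, similarity (or None), decision."""
    sim = {r["paper_id"]: float(r["similarity_pct"]) for r in crosscheck_rows}
    summary = []
    for r in decision_rows:
        pid = r["paper_id"]
        summary.append({
            "paper_id": pid,
            "similarity_pct": sim.get(pid),  # None -> not yet in CrossCheck
            "decision": r["decision"],
        })
    # Highest similarity first, so borderline papers surface at the top
    return sorted(summary, key=lambda x: -(x["similarity_pct"] or 0.0))
```

Papers with `similarity_pct` of `None` are the quick check for submissions that were missed in the upload batches.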

Opportunities for improvements to CrossCheck - [random, scattered] thoughts and questions

Question : Is a strong level of [diverse, fine-grained] [self, external] text-similarity acceptable?

Does a high diversity of sources help "alleviate" the [self, external] text-similarity issue? Consider the case where most of the text similarities come from a large number of [the authors' previous, non-author] publications. Furthermore, assume that these are well-interspersed in the IJCNN paper.

A paper can have high overall similarity, even exceeding 50%, just from this type of situation, without having copies of entire paragraphs, and even where no source accounts for more than 2-5% similarity. Do we want to force somewhat low-level word-crafting by authors for cases like this "just for the sake of avoiding the perception of a repeat paper with nothing new"? Is that productive, does it help readers or remove the impression that they read the same paper before, is it required for copyright reasons?

Take the "extreme, trivial case" of single-word similarity. In that case, ALL papers would obviously fail and be rejected, as we are forced to use a language. What about two to five word phrases? Obviously, CrossCheck doesn't include these extremes, and seems to have a lower cut-off of something like 6-10 word sequences. Could you even communicate if you avoided common phrases used in a language? Perhaps CrossCheck does eliminate most common longer phrases from the overall similarity rating, I don't know. From the IEEE CrossCheck Portal Guide :
"... Interpreting Matching Percentages of Individual Sources
It may seem that any source of matching text should be a concern, but in fact many matching sources are likely to not be the result of text-similarity. For example: ..."


So CrossCheck does suggest some degree of allowance, the question is how much and what criteria? It's important that suggestions be workable for volunteers who have never worked with CrossCheck before, and who don't want to spend a huge amount of time becoming experts on the analysis.
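As a numerical illustration of the question (NOT a CrossCheck rule), one could compare the raw overall similarity with an "adjusted" figure that ignores sources below some per-source threshold; the 2% cut-off here is a hypothetical choice of mine, purely to show how the "many small diverse matches" case differs from a few large copied blocks at the same overall percentage.

```python
# Illustration only: two papers with the SAME 45% overall similarity,
# one from many small diverse sources, one from a few large sources.
# The 2% per-source threshold is a hypothetical choice, not a
# CrossCheck rule.
def adjusted_similarity(per_source_pcts, min_source_pct=2.0):
    """Sum only the per-source matches at or above the threshold."""
    return sum(p for p in per_source_pcts if p >= min_source_pct)

diverse = [1.5] * 30         # 30 sources at 1.5% each -> 45% overall
copied = [25.0, 15.0, 5.0]   # 3 large sources         -> 45% overall
```

Under this (hypothetical) rule, the diverse paper's adjusted similarity drops to 0% while the copied paper stays at 45%, even though CrossCheck reports both at 45% overall. Whether any such allowance is appropriate, and at what threshold, is exactly the open question above.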

Another factor is the huge amount of time required to do the checks, even at a basic level. That means we have much less time for looking at potential cases of real text-similarity - perhaps no time at all if arXiv pre-publishing becomes common practice.

At present, papers are rejected at (30% < similarity) where there is a strong level of [diverse, fine-grained] [self, external] text-similarity. My own feeling is that at some level, a bit higher than the CrossCheck threshold, this could be acceptable?

[arXiv, aclweb]-like postings that are not currently allowed by the IEEE (Note: thesis material is considered non-published for CrossCheck) :
- www.groundai.com
- hal.archives-ouvertes.fr, tel.archives-ouvertes.fr
- biorxiv.org

Frequently Asked Questions (FAQs)

The following selection from the IEEE Support Center FAQs may be of help, and you can click the preceding link for additional help. Links are as of 01Oct2018, and change occasionally. I don't know if you have to be an IEEE member to access these pages (I haven't checked). As usual, it is often easier to web-search using [Duckduckgo, Google, Bing, etc] with, for example, "IEEE FAQ and What graphic file formats are supported by the IEEE Graphics Analyzer Tool?". Readers may find [better, really useful] IEEE FAQs; if they have their own advice to add to this, please email your [links, comments] to me.




Directory of available files for this webpage

Many thanks to our Sponsors & Exhibitors

Ongoing support : [IEEE, IEEE-CIS] for [CEC, FUZZ, IJCNN]; INNS for IJCNN; [IET, EPS] for CEC
2020 IEEE World Congress on Computational Intelligence, Glasgow, Scotland.....19-24 July 2020

2019 International Joint Conference on Neural Networks, Budapest, HUNGARY.....14-19 July 2019