Conference Guides for [IEEE-CIS, INNS] conferences : Although this guide was initially set up for IJCNN2019, and was used for IEEE WCCI 2020, hopefully some of the basic information may still help [organising committee members, authors] up to 2023-2025 (assuming a half-life of links and process information of 3-5 years?). The Authors' and Publications menus have been combined, allowing authors to see the Publications Chair perspective as well. I am no longer keeping this up to date. Bill Howell, 11Dec2020
IEEE CrossCheck - understanding the text-similarity checks
"... IEEE policy requires that all accepted papers must be checked for text-similarity.
Authors ARE involved in this process! They MUST be aware of the criteria
The IEEE CrossCheck Portal is available to all conference organizers and periodical editors to help screen manuscripts for plagiarized material. The IPR Office partnered with the IEEE Publications Technology department to develop the CrossCheck Portal as a stand-alone web application that can be used by any publications volunteer at any time.
... Conference organizers are encouraged to use CrossCheck as early as possible during the manuscript review process.
Table of contents :
29Jan2019 Howell note : We have lost far too many great papers in an entirely avoidable fashion, as well as upset many authors (as with IJCNN2017). However, it's too late for IJCNN2019 for the ideas below. Authors really need to be warned, and to have access to the tools, at least by the time that paper submissions are first opened.
- Authors who have access to [iThenticate (CrossCheck), NTU's TurnitIn, etc] should use these tools BEFORE submitting their papers.
- What I really regret is NOT having posted a [free, simple] pre-check tool for authors EARLY on (as many authors do not have access to CrossCheck or other equivalents), long before the paper submission date. Although it may be easy to program something simple, it would take time, plus it may not be easy to get the minimum required [accuracy, reliability] for authors to be comfortable that they do comply.
- Maybe there is something already out there for that purpose?
- Maybe iThenticate has something to apply to a paper that has been written and a few related publications by the authors?
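As a starting point for such a [free, simple] pre-check, a few lines of Python can estimate the fraction of a draft's word sequences that also appear in an author's earlier papers (this assumes plain-text extracts of the papers; the 8-word shingle length is my guess at a CrossCheck-like phrase cutoff, not a documented value) :

```python
import re

def ngrams(text, n=8):
    # Lowercase, strip punctuation, split into words, then collect n-word shingles.
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def similarity(draft_text, source_texts, n=8):
    # Fraction of the draft's n-word shingles that appear in any source text.
    draft = ngrams(draft_text, n)
    if not draft:
        return 0.0
    pool = set()
    for src in source_texts:
        pool |= ngrams(src, n)
    return len(draft & pool) / len(draft)
```

This will never reproduce CrossCheck's numbers, but it gives authors a rough, free self-similarity estimate long before the submission deadline.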
03Mar2019 Howell note : Do NOT use the word "plagiarism"!! That is completely inaccurate as explained below and in the "Authors' Guide : IEEE CrossCheck blog". Not only is this wrong, it is upsetting to authors and they will [probably, justifiably] complain. I now use the expression "[self, external] text similarities".
Also, I strongly encourage making the quantitative criteria for paper rejections on the basis of text similarities clear, including provision of the formula used for screening. I have posted the actual CrossCheck results publicly (coded by paper submission number, no author names), as you can find via the links in the "Authors' Guide : IEEE CrossCheck [author, blog]" web-pages. This was of great assistance to authors, as they could see how their paper ranked compared to others, and they could even run their own assessment to see if their rejection was fair based on the criteria.
The "Authors' Guide : IEEE CrossCheck Portal - understanding the text-similarity checks" provides :
This material will not be repeated here.
- a description of the CrossCheck system
- IEEE's five levels of plagiarism
- a discussion about "External" versus "Self" plagiarism
- CrossCheck "Similarity %" breakdown for IJCNN2017 Anchorage papers
- Authors who spot plagiarism in IJCNN papers after the Proceedings are released
- IEEE publications bans on authors - who we must screen out
- Descriptions of CrossCheck, iThenticate and its "CrossRef"
eLearning Course: CrossCheck for Conferences - As part of the MCE Conference Education Program, learn how to access and utilize CrossCheck, a plagiarism detection tool. This eLearning course is available at no cost to IEEE conference organizers and offers an opportunity to earn a certificate for professional development hours.
IEEE CrossCheck Portal Guide (PDF, 3.2 MB) - IEEE CrossCheck User's Guide provides step-by-step instructions for using the CrossCheck detection service. It also helps users review and interpret CrossCheck Similarity Reports and explains what actions to take.
CrossCheck is a great tool for highlighting papers that warrant further investigation. It is NOT a stand-alone or complete tool for making a decision, and in many cases we have to rely on the reviewers to catch serious cases. For example, even though authors may have copied [theoretical developments, analysis, results] from published papers, if they change the wording and rearrange equations, their "Similarity %" may be quite low. By relying too heavily on CrossCheck, you might simply be weeding out the sloppy transgressors.
1. From the conference paper system (Tomasz Cholewo), save files in batches of 100-150 papers to an appropriately named zip file (maximum 200 Mbyte zip file size). I didn't succeed in doing this for IJCNN2019, so it was done by the General Co-Chair. Note that you can do this file-by-file, but it takes a long time for a few hundred papers, believe me!
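The batching itself can be scripted. A minimal sketch, assuming the papers have already been downloaded as PDFs into one directory (the 150-paper batch size follows the guide above, and the output names are deliberately kept alphanumeric for the upload step) :

```python
import zipfile
from pathlib import Path

def batch_zip(paper_dir, out_dir, batch_size=150):
    # Split the papers into zip files of at most batch_size papers each,
    # with purely alphanumeric archive names (plus the .zip extension).
    papers = sorted(Path(paper_dir).glob("*.pdf"))
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    archives = []
    for i in range(0, len(papers), batch_size):
        zpath = out / f"batch{i // batch_size + 1:03d}.zip"
        with zipfile.ZipFile(zpath, "w", zipfile.ZIP_DEFLATED) as zf:
            for p in papers[i:i + batch_size]:
                zf.write(p, arcname=p.name)
        archives.append(zpath)
    return archives
```

Check the resulting zip sizes against the 200 Mbyte limit before uploading; if a batch is too big, reduce batch_size and re-run.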
2. Upload the zip file to CrossCheck : As indicated by the "UPLOAD" to the right of the conference publication name on the initial page that you see after logging in. As CrossCheck reminds you, this takes 4-8 hours!
- Log onto CrossCheck
- username : You must be set up in CrossCheck with the same email address as you use for your IEEE Web Account. The email address must be all lowercase, I think.
- password : is the same as you use for your IEEE Web Account, as a member of IEEE
- After you log in, the first page that shows up should list the "Publication Title : 2019 International Joint Conference on Neural Networks (IJCNN)".
- Click on the "Upload" link to the right of the PubTitle.
- If you can't upload the zip file, check the name of the zip file. CrossCheck will choke if there are characters like "-_()" etc - make sure that ONLY alphanumeric characters (and the period in the .zip file extension) are used - if not, change the zip filename.
- If the file starts to upload, you will get a pop-up window (make sure that pop-ups are enabled!) saying that the file is being processed. Uploading a zip of 100 files takes a long time : 20 to 40 minutes! So be patient and wait until that pop-up window confirms that the file has uploaded.
- Log out of CrossCheck when the file has uploaded.
- You have to wait 4-8 hours for the uploaded zip files to be processed.
- After the wait time, check again to see if the files are listed when you click on the "Publication Title : 2019 International Joint Conference on Neural Networks (IJCNN)".
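Since the upload chokes on punctuation in the filename, it is worth sanitizing zip names before uploading. A tiny helper for this - the exact set of characters CrossCheck rejects is not documented, so this conservatively keeps only alphanumerics :

```python
import re

def safe_zip_name(name):
    # Keep only alphanumeric characters in the stem; CrossCheck has been
    # observed to choke on characters like "-", "_", "(" and ")".
    stem = name[:-4] if name.lower().endswith(".zip") else name
    return re.sub(r"[^A-Za-z0-9]", "", stem) + ".zip"
```

For example, a name like "IJCNN_batch-01 (final).zip" becomes "IJCNNbatch01final.zip".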
Start with the IEEE tutorial and guide (links in the Background section above), which explain how to get into the system, upload [individual, zip file of <= 200 Mbytes (150 papers)] papers, and do a range of tasks in the system.
As a rough estimate, I took 15-30 minutes per paper analysis, after spending ~2/3 of an hour per batch of 100 papers for [zip download, upload to CrossCheck], which doesn't include creating the zip file for a batch of papers. Both estimates are highly variable, and could improve over time as one becomes familiar with the processes and CrossCheck.
My approach is (very generally) for a paper analysis :
This compact list should NOT be taken as a rule, but should be [criticised, corrected, improved]!
- Open CrossCheck and click on your "Conference name" to bring up a list of CrossCheck summary results for that conference. These are listed in order of decreasing overall similarity %.
- Find the next paper that you must analyse, starting from the top of the list, generally papers with >= 50% similarity.
- I type out key information for each paper that I analyse in a text file containing all of my CrossCheck results (example below). This provides an auditable record of my analysis (proof of a check), and is very handy if I go back to a paper, or especially if others are to follow up on my analysis or use my results to [accept, reject] a paper. Add a "Start template" for that paper, adapting the basic information to the paper being analysed : add the [paper ID, overall % similarity] by copy-pasting from the CrossCheck list, then get the authors and titles from a spreadsheet of "Papers : view" output of the IJCNN paper review system, and paste them into the paper analysis.
- Click on the :"Review Report" link of the CrossCheck summary list. This will bring up a separate [window, tab] for that paper.
- I start at the "root" level of the "Match Overview" panel at the right of the CrossCheck analysis page for a given paper. You can tell if you are at the root level, as otherwise a left-arrow appears to the left of the words "Match Overview". If the left-arrow appears, click it to get to the root level.
- While at the root level of the "Match Overview" panel, browse through the paper as shown in the left panel. Color coding and numbers indicate sources of similar text. Fill in the "Overall view" rough, eyeballed] estimates of the %similarity for each part of the paper, including comments as to whether whole paragraphs or sections are copies, or whether the similarities are finely disseminated from many sources. Sometimes (as per the example given below), I type the fraction of each page that is covered by text, and the portion of the text that has a colored background (and therefore this text appears in other sources). By a straightforward calcualtion, my own eye-balled estimate of a paper's overall similarity is obtained.
- High-similarity arXiv papers were downloaded and checked to make sure that the authors were the same. This is important : if the arXiv paper is by non-authors, then the paper is to be rejected; if the authors are the same, the paper can be accepted if there are no other problems.
- For each "top-level" source (these are generally collections of many sources, and are numbered), click to expand the group and show sub-sources.
- Make notes for each sub-source (see "Example of detailed CrossCheck analysis notes for a paper" below).
- create a one-sentence summary comment that is easy to group into classes of papers for further action by the Chairs for [accept, reject] on the basis of CrossCheck.
- CrossCheck doesn't work for seeing if data in tables is new, nor will it always highlight similar [concept, formulae & derivations, data, results]! While it's best to check key "external papers" for [concept, formulae & derivations, data, results] to look for [deeper, more serious] plagiarism than simply the wording, I did not purchase any papers for checks (I am retired and this is prohibitively expensive). Where available for free, I downloaded high-similarity papers that CrossCheck highlighted as sources of similarities.
- I DON'T use the CrossCheck printout of results - this is essentially a pdf overview of the results without links to key sources and their details (which can still be handy), but the full CrossCheck results would be HUGE even for a modestly complex report. I think it's more useful to have a short text summary for reference and follow-up by others, who can look on CrossCheck themselves to delve into the analysis in detail. Furthermore, coding (like ">>>>>" at the start of a line in the text file containing all your analysis) makes it easy to cull out summary comments for all papers in the batch.
- After finishing analysis of the batches under consideration, create a listing of the summaries for each paper, and sort the papers into "Recommended action bins" (see below "Example of sorting paper summaries into action lists"). This is part of the "text file containing all of my CrossCheck results".
- Submit the "text file containing all of my CrossCheck results" to the General Co-Chair and other Chairs who make the final decision.
As a rough guide of how this was applied for IJCNN2019 :
- >=50% similarity - All such papers are rejected EXCEPT when special conditions apply :
- similarities come from an arXiv posting by at least one of the same authors. This includes cases where the entire paper was pre-posted to arXiv.org. At present, other arXiv-like sites are not accepted, although the IEEE may add other sites to the list in the future.
- a paper draws extensively from a thesis by one of the authors
- a paper is close to 50% similarity, and mostly involves a diverse set of source papers (mostly the authors' previous papers), and fine-grained (eg sentence to paragraph level) similarities rather than >1/2 page "chunks" of contiguous text from the same source.
- 30% < similarity < 50% - Rejection is primarily based on :
- "chunks" of contiguous text >= 1/2 a page in length, rather than [phrase, sentence, paragraph] level, with an emphasis on [theory development, data, results, discussion, conclusions], and less of a focus on [Introduction, Literature Review, References]
- AND one of :
- >10% maximum similarity with a source by different authors, although two or more at 8-12% can be a problem
- >20% maximum similarity with a source by at least one of the same authors
- a full page of contiguous text from a single source is a red flag, possibly acceptable if other similarities are low.
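These thresholds can be summarised as a rough decision sketch - my own paraphrase of the guide above, not an official IEEE rule; all the parameter names are made up, and the Chairs always make the final call :

```python
def screening_recommendation(overall, max_external, max_self,
                             max_chunk_pages, arxiv_same_authors=False,
                             from_own_thesis=False):
    # Rough encoding of the IJCNN2019 screening guide. Percentages are 0-100,
    # chunk size is in pages. Returns a recommendation only, not a decision.
    if overall >= 50:
        if arxiv_same_authors or from_own_thesis:
            return "review"   # special conditions may apply
        return "reject"
    if overall > 30:
        big_chunk = max_chunk_pages >= 0.5
        if big_chunk and (max_external > 10 or max_self > 20):
            return "reject"
        return "review"
    return "accept"
```

The near-50%, diverse fine-grained case from the special conditions above would also land in "review" rather than being auto-rejected, which matches how it was handled in practice.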
Examples of detailed CrossCheck analysis notes for a batch of papers
Here is an example report of a detailed analysis of CrossCheck results for a batch of files, including a summary of recommendations for each paper, grouped into "recommended actions". In this case, only papers with CrossCheck results >=50% similarity were covered, as the first stage of analysis. The same approach is used for papers with (30% < similarity <= 50%), but given the huge number of such papers, it is only practical to do a sampling, perhaps 5-10 in a hundred.
For privacy reasons, [authors' names, paper titles] have been replaced by one of :
<.IJCNN paper authors' names.>
<.IJCNN paper title.>
<.Other paper authors' names.>
<.Other paper title.>
The example from IJCNN2017 is provided in this image - short commented results of the analysis in all targeted cases (right-click to open in a new tab or window) :
CrossCheck IJCNN 2017 analysis of close checks
These examples also bring out cases where "plagiarism" isn't really a problem (the [course notes, PhD thesis] examples), and how "self-text-similarity" is actually the major portion of CrossCheck results - noting that excessive self-text-similarity is still reason for rejection. All author names and information that might compromise privacy have been removed.
Notification of rejection on the basis of CrossCheck [results, analysis] involves :
While this is not necessary, it is handy to have an overall summary list of CrossCheck results. This allows easy comparison of summary CrossCheck results and [accept, reject] decisions, as well as quick checks for papers that haven't been submitted to CrossCheck for analysis.
- The Publications Chair passes the list of papers and analysis to the General Co-Chair.
- The General Chair makes the final decision on CrossCheck rejections and puts a comment into the paper submission system "PC MEMBERS COMMENTS", regarding the paper's rejection (see the comment in the Program Chair rejection notice).
- The Program Chair sends paper rejection notifications to all authors, which include the comments on CrossCheck rejections in the "PC MEMBERS COMMENTS" field, and which invite the authors to contact the Publications Chair for details (see the Program Chair rejection notice).
- The Publications Chair responds to author inquiries about their CrossCheck rejection (see the Publications Chair explanation of CrossCheck results and analysis), including a generic comment, the CrossCheck print-out pdf, CrossCheck analysis comments, and offers to respond to any questions that they may have.
- The Publications Chair sends the list of rejections to the IEEE Intellectual Property group (email???), as they will need to decide on possible Publications bans.
After ALL conference papers have been uploaded to CrossCheck (remember - it takes up to 8 hours for results to be available!) :
- within CrossCheck : click on your "Conference papers" from the opening window (get there via the "My Publications" button near the top of most CrossCheck windows)
- For each web-page of the listing (for ~1500 papers there are about 41 pages, so this takes time and is annoying!), select the entire list of papers, copy, and paste-append to a common text file (eg "190112 CrossCheck similarities for papers on hand.txt", where 190112 = YYMMDD (year, month, day)).
- For further analysis : Copy-paste the whole list to a spreadsheet in a workbook with a list of conference [papers, authors].
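Once the pages are pasted into one text file, pulling out the (paper ID, overall similarity) pairs can be scripted. The sketch below assumes each pasted line starts with the numeric paper ID and contains the overall percentage later on that line - the format of the pasted text is an assumption, so adjust the regular expression to whatever your browser actually produces :

```python
import re

def parse_similarity_lines(pasted_text):
    # Extract (paper ID, overall similarity %) pairs from copy-pasted listing
    # pages, then sort by decreasing similarity for triage.
    rows = []
    for line in pasted_text.splitlines():
        m = re.match(r"\s*(\d+)\b.*?(\d+)%", line)
        if m:
            rows.append((int(m.group(1)), int(m.group(2))))
    return sorted(rows, key=lambda r: -r[1])
```

The sorted output pastes cleanly into the spreadsheet alongside the conference [papers, authors] list, and makes it easy to spot papers that were never uploaded.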
Does a high diversity of sources help "alleviate" the [self, external] text-similarity issue? Consider the case where most of the text similarities come from a high number of [the authors' previous, non-author] publications. Furthermore, assume that these are well-interspersed in the IJCNN paper.
- Many authors need a [free, simple] system (eg simple Unix scripts) to pre-estimate self-similarity from a set of their own pdf publications, as most rejections could easily be avoided!! Others have access to [CrossCheck (iThenticate), NTU's TurnitIn, etc] and should use those tools.
- For the IJCNN2019 decisions it would really help if the columns "maxChunk (pg)" and "max%Group" [self, external] similarity were automatically calculated, but I guess CrossCheck can't accommodate a jillion different bases for decisions.
- From: Antonio Luchetta <@unifi.it>
- Before launching the CrossCheck test, set the "Exclude Bibliography" flag to ON, as the journals' editors do; the references, by their very nature, fully overlap other bibliographies or paper titles and quickly increase the similarity index, without being plagiarism at all!
- Exclude the sources under a given threshold (1-2%); our paper, for instance, reaches a 43% similarity index with 54 (!) sources of less than 1% overlap each, and could probably reach 99% if we added the Holy Bible and the Divina Commedia among the sources, whereas the "real" (auto)plagiarism index is less than 10%.
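Luchetta's second point is easy to see with illustrative numbers (these are made up, not his paper's actual breakdown): dozens of sub-1% sources can inflate the headline index far above the single source that actually matters.

```python
def similarity_index(source_overlaps, threshold=0.0):
    # Sum of per-source overlap percentages, ignoring sources at or below the
    # threshold. A crude model: it assumes source overlaps don't intersect.
    return sum(s for s in source_overlaps if s > threshold)

# 54 tiny sources at ~0.8% each, plus one real 10% self-overlap:
sources = [0.8] * 54 + [10.0]
raw = similarity_index(sources)            # ~53%, looks alarming
filtered = similarity_index(sources, 1.0)  # 10%, closer to the real picture
```

With a 1% exclusion threshold the index collapses from ~53% to 10%, which is exactly the kind of distortion the "Exclude sources under a threshold" setting is meant to remove.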
- a text-file output of the FULL list of one-line summary results for all papers would be very handy. It's [long, laborious, boring] to do that by copy-paste of each web-page of the listing! This list is needed to make sure that papers have not been missed.
- text output for copy-paste!!! (hugely important for documenting decisions and allow others to [follow-up, repeat to confirm]).
- too much time is spent on checking into "safe" similarities, so that one is not able to scrutinize for real plagiarism.
Small improvements to CrossCheck could help reduce this problem
- arXiv and other accepted pre-publication online postings :
- ?List of similar search results? - text output of [title, authors] for similar papers
- if [authors, title] of arXiv paper are same as IJCNN submission - don't include in stats and don't list!!! This would save HUGE time!
- A real-time list of IEEE accepted arXiv-like sites that authors can pre-post papers to be submitted would help, as it appears we will see more and more examples of this. For now, we assume that ONLY arxiv.org is acceptable!!
- I have the distinct impression (non-confirmed) that some papers are possibly being HUGELY penalised because there is text from another paper that appears very widely on the web (different publications, author sites, etc) - i.e. multiple equivalent sources. Even if there are only one or two papers like that at 7 to 15% levels, the repeat sourcing can lead to overall similarity >50%, and a paper rejection by current policy.
- My own feeling is that the CrossCheck phrase-length threshold (which may already be flexible) could be set to a slightly higher number of words per phrase, to eliminate excessive hits on common phrases? See below "Question : Is a strong level of [diverse, fine-grained] [self, external] text-similarity acceptable?". I'm sure that this issue has been discussed at length by those who have contributed to the system. But I suspect that there is a chasm between the criteria, which are not always pragmatic, and what people really do, as the time required is very large. (It's easy to dream up a huge number of ideas; it's much more important to evolve these things by practice.)
- I'd like to see text reports of the results of analysis for papers in other conferences to see how they do it, and whether their documented [criteria, analysis] for all conference papers (not just the rejected ones) are adequate for responding to authors' challenges. I'd like to see evidence of others doing much more than simple, cursory decisions, which may be somewhat arbitrary, personal, and ill-defined.
- Suppose that a long phrase, in an area of expertise like neural networks, has been commonly used by more than say 3 to 10 authors. Then, as a pragmatic approach, a collection of such phrases specific to a conference could be deemed to be "public domain" as far as CrossCheck is concerned, and wouldn't be reported nor would the associated hits contribute to the overall rating of a paper? That might hugely complicate CrossCheck, given the huge number of conferences, so I don't know if this is practical.
- Perhaps CrossCheck is missing the most critical plagiarism checks : one cannot expect CrossCheck to catch "clever plagiarism" involving perturbations of [concepts, formulae & derivations, data, results].
But it can help to zero in on these, somewhat?
- One cannot text-search through authors' papers within iThenticate, so links and other information must be re-typed.
- We need more than one person doing the CrossCheck analysis, given the time needed to do them so papers can be rejected BEFORE sending them to review (and wasting the time of MANY reviewers). It seems that uploads by others may not show up on each person's list even if the same publication name is used? (I'm not sure yet) If so, it would be handy if everyone on the team could access the complete list, as we can then each take a look to help each other out with interpretations. The General Co-Chair makes the final decision, so access to all CrossCheck analysis for the conference is especially important in that case.
A paper can have high overall similarity, even exceeding 50%, just from this type of situation, without having copies of entire paragraphs, and even where no source accounts for more than 2-5% similarity. Do we want to force somewhat low-level word-crafting by authors for cases like this "just for the sake of avoiding the perception of a repeat paper with nothing new"? Is that productive, does it help readers or remove the impression that they read the same paper before, is it required for copyright reasons?
Take the "extreme, trivial case" of single-word similarity. In that case, ALL papers would obviously fail and be rejected, as we are forced to use a language. What about two to five word phrases? Obviously, CrossCheck doesn't include these extremes, and seems to have a lower cut-off of something like 6-10 word sequences. Could you even communicated if you avoided common phrases used in a language? Perhaps CrossCheck does eliminate most common longer phrases from being included in the overall similarity rating, I don't know.
IEEE CrossCheck Portal Guide :
"... Interpreting Matching Percentages of Individual Sources
It may seem that any source of matching text should be a concern, but in fact many matching sources are likely to not be the result of text-similarity. For example:
- < 1%-3% match—Occurs with small groups of similar words or a few short phrases. In general, there is little need to review these sources.
- 4-7% match—These matches can be similar single sentences or a small paragraph. One source at this level may not be an issue, but several sources at this percentage level could signify an overall problem with the submission.
So CrossCheck does suggest some degree of allowance, the question is how much and what criteria? It's important that suggestions be workable for volunteers who have never worked with CrossCheck before, and who don't want to spend a huge amount of time becoming experts on the analysis.
Another factor is the huge amount of time required to do the checks, even at a basic level. That means we have much less time for looking at potential cases of real text-similarity (for example in [concepts, formulae & derivations, data, results]) - perhaps no time at all if arXiv pre-publishing becomes common practice.
At present, papers are rejected at (30% < similarity) where there is a strong level of [diverse, fine-grained] [self, external] text-similarity. My own feeling is that at some level, a bit higher than the CrossCheck threshold, this could be acceptable?
(Note: thesis material considered non-published for CrossCheck)
- hal.archives-ouvertes.fr, tel.archives-ouvertes.fr
Frequently Asked Questions (FAQs)
The following selection from the IEEE Support Center FAQs may be of help, and you can click the preceding link for additional help. Links are as of 01Oct2018, and change occasionally. I don't know if you have to be an IEEE member to access these pages (I haven't checked). As usual, it is often easier to web-search using [Duckduckgo, Google, Bing, etc] with "IEEE FAQ and What graphic file formats are supported by the IEEE Graphics Analyzer Tool?" (as an example). Readers may find [better, really useful] IEEE FAQs, or if they have their own advice to add to this, please email your [links, comments] to me.
What is the conference ID? The Conference ID is your conference record number that's been assigned to your conference followed by "x" or "xp" that is required for setting up and logging into your author account. If your...
Publications Chair comment : for IJCNN2019 Budapest Conference Record = #46175.
Many thanks to our Sponsors & Exhibitors
Ongoing support : [IEEE, IEEE-CIS] for [CEC, FUZZ, IJCNN]; INNS for IJCNN; [IET, EPS] for CEC
2020 IEEE World Congress on Computational Intelligence, Glasgow, Scotland.....19-24 July 2020
2019 International Joint Conference on Neural Networks, Budapest, HUNGARY.....14-19 July 2019