Authors & Publish chair page, blog	Paper formatting page, blog	Initial paper submission chair, page, blog	Final paper submission chair, page, blog	Problematic papers - corrections page, blog
	Author [PDF,CrossCheck] tests page, blog	IEEE PDF eXpress - paper format chair, page, blog	IEEE electronic Copyright (eCf) chair, page, blog	Attendee downloads of papers page, blog
	Conference registration page, blog	Travel visas to Hungary page, blog	Conference presentations page, blog	HELP contacts WCCI2020, system
Non-Author actions	Paper reviews - authors' perspective page, blog	IEEE CrossCheck text similarity chair, page, blog	IEEE Xplore web-publish chair, page, blog
	IEEE Conference Application chair	IEEE Letter of Acquisition chair	IEEE Publication Form chair	Software systems page

Conference Guides for [IEEE-CIS, INNS] conferences : Although this guide was initially set up for IJCNN2019, and was used for IEEE WCCI 2020, hopefully some of the basic information may still help [organising committee members, authors] up to 2023-2025 (assuming a half-life of links and process information of 3-5 years?). The Authors' and Publications menus have been combined, allowing authors to see the Publications Chair perspective as well. I am no longer keeping this up to date. Bill Howell, 11Dec2020

Blog : IEEE CrossCheck Portal - understanding the text-similarity checks

The "Authors' Guide Blogs" (see links in manu near top of page) are in an early draft form, and are being run on a trial basis. Comments are moderated by the Publications Chair before posting, so expect delays of at least a day or so before they appear. Note that emails have been edited - usually by [omitting salutations, endings], but also by omitting material not relevant to the "theme" under which the emails are placed below.

31Jan2019 - Extensive postings and some changes to this blog (notably the list of themes immediately below) have been introduced now that the CrossCheck analysis is well underway. Note that there are many IJCNN2019 papers that may still be affected by CrossCheck, but at this point we may have to wait until the peer review process is completed.

Authors should be [warned, armed, prepared]!
How close was my paper to being rejected by CrossCheck analysis?
Author concerns that CrossCheck is neither appropriate nor powerful enough
- Replies regarding the limitations of CrossCheck
Examples of novel papers failing CrossCheck analysis
- Replies regarding exceptions for novelty
Special criticism of CrossCheck rejections
Shouldn't there be a difference between "core" versus "peripheral" sections of a paper?
- Replies regarding "core" versus "peripheral" sections
Authors' suggestions for improvements to CrossCheck
- Replies to improvements to CrossCheck
Author's initial inquiries and replies about how to prepare for and what to expect from CrossCheck
Pre-posting of papers to arxiv.org and other sites
Typical preliminary response to authors whose paper was rejected based on CrossCheck similarity analysis

Authors should be [warned, armed, prepared]!

29Jan2019 Howell : We have lost far too many great papers in an entirely avoidable fashion, as well as upset many authors (as with IJCNN2017). However, it's too late for IJCNN2019 for the ideas below. They relally need to be warned, and have access to the tools at least by the time that paper submissions are first opened.

Authors who have access to [iThenticate (CrossCheck), NTU's TurnitIn, etc] should use these tools BEFORE submitting their papers.
What I really regret is NOT having posted a [free, simple] pre-check tool for authors EARLY on (as many authors do not have access to CrossCheck or other equivalents), long before the paper submission date. Although it may be easy to program something simple, it would take time, plus it may not be easy to get the minimum required [accuracy, reliability] for authors to be comfortable that they do comply.
- Maybe there is something already out there for that purpose?
- Maybe iThenticate has something to apply to a paper that has been written and a few related publications by the authors?

How close was my paper to being rejected by CrossCheck analysis?

Authors of papers who passed the CrossCheck analysis may still want to know their paper's measures with respect to the IJCNN2019 criteria. There are many papers uncomfortably close to the borderline, and in the future thay will want to be comfortably under the limit of the criteria to ensure acceptance. This is especially important given the inherent uncertainty of the measures, especially the "maximum Chunk size", which is a subjective measure. Also note that criteria vary according to the conference!

Look at the following workbook to your paper's measures according to the criteria.

Here is a summary listing of IJCNN2019 CrossCheck summary outputs, to which we have applied criteria as a screen, and after which further investigation is often required. Also, given the volume of work and my haste in doing it, there are numerous errors in the spreadsheet IJCNN2019 CrossCheck.
A [general, outdated, sometimes erroneous] description of the checks is found on the Authors' Guide : IEEE CrossCheck web-page.

Examples of novel papers failing CrossCheck analysis.

There were several of these, and the list below is not complete!

+-----+ Subject: Re: IJCNN 2019 Paper #19480 - CrossCheck text-similarity analysis
Date: Wed, 30 Jan 2019 16:53:16 +0100
From: Parisa RASTIN
To: Bill Howell. Hussar. Alberta. Canada

As you explained in your email, there is a "self text-similarity" between one of our previous article and our new article. However, the main similarities (0.8 pg chunk) are in section II : "background and related works", with some smaller chunks in the "data-sets description" and "quality measures definitions". All of these sections are state-of-the-art and there is not reason for us to describe it differently from our previous paper published in IJCNN 2018. We describe the previous approaches, as well as state-of-the-art data sets and quality indexes, in order to show the efficiency of the new algorithm and the improvements in comparison to our previous work. Please note that the IJCNN 2018 paper is actually cited several times in the manuscript (citation [15]), although there seems to be a typo in the document such as only the conference name and the year appears, without the author's name and the paper title.

Our paper does propose new and unpublished work. In particular, Sections III and V, which explain the new algorithm and the experimental results, have only marginal similarity with other papers. In the .xlsx document you provided, other papers with similar self text-similarity and bigger page chunks seem to have been accepted when the similarities were mainly in state-of-the-art sections.

We wish that you could revise your evaluation based on these information and let us have the chance to have our paper reviewed.

-+-----+
Date: Mon, Jan 28, 2019 at 7:10 PM
From: Parisa RASTIN <@lipn.univ-paris13.fr>

Dear IJCNN General Co-Chairs,

I have received an email on behalf of the IJCNN 2019 technical program committee and technical chairs that my paper (ID: 19480) has failed IEEE Crosscheck plagiarism analysis.

This is very surprising for us, as the proposed approach, as well as the experimental results, described in the paper have never been published elsewhere.

Some part of the databases description and the definition of well-known quality indexes used in the experiment are similar to a previous paper by the same author published in IJCNN 2018, but should not be considered as "plagiarism". Online tools we found to compare the two papers only give a 13% similarity.

If we really have a big overlap with other papers, we would be very interested to have more information about the result of the plagiarism analysis, as we have no access to the IEEE Crosscheck tool.
However, if the overlap is only state of the art based on our own papers, we probably deserve a revision of your decision and a chance to have a review of our work.

+-----+
Date: Fri, Jan 25, 2019 at 2:58 PM
From: Antonio Luchetta <@unifi.it>
To: Chrisina Jayne,

seen that is not included in the "reject" revision you sent us, could you kindly send us the Crosscheck report of our paper? We cannot directly use Crosscheck, but using other antiplagiarism software (usually very consistent with Crosscheck) we have a result of less than 10% of (auto)plagiarism.

We have been really dismayed by the decision of excluding our paper because of a preliminary plagiarism check.
The work described in the paper is deeply novel in both theoretical development and experimental part and never published elsewhere! Of course there were, behind it, fundamental theories developed by the authors in the past, and a part which describes and references these methods, as it is natural in every research work and yourselves declaim in the plagiarism Notes of Authors' Guide. At least we will aware of the unacceptable part of the paper and we will be able in future to avoid similar outcomes.

-+-----+
Subject: Re: IJCNN2019 CrossCheck analysis and paper rejections
Date: Thu, 24 Jan 2019 14:06:47 -0500
From: Supriyo Bandyopadhyay <@vcu.edu>
To: Bill Howell. Hussar. Alberta. Canada , Amit Trivedi <@uic.edu>, Justine Drobitch <@mymail.vcu.edu>, Shamma Nasrin <@uic.edu>, Chrisina Jayne , Supriyo Bandyopadhyay <@vcu.edu>

Thank you for getting back to me. This paper builds on our previous work in ref. [5] and therefore there will naturally be some similarities between this work and reference [5]. In fact, the last sentence in the first page states categorically "This work extends on our prior work [5] to consider DBN".

It is common practice to extend one's work (or even somebody else's work), as long as the original source is cited. We have carefully cited the original source.

My guess is that the Cross-Check algorithm is not smart enough to catch all this and therefore has run amok. We would like a resolution of this and will wait to hear from you after you have completed your work. This has nothing to do whether the paper is accepted or not.

Replies regarding exceptions for novelty

31Jan2019 Howell : To date, papers have not been re-assessed on the basis of novelty, not have authors been given a chance to re-word papers to pass the IJCNN2019 CrossCheck criteria.

Author concerns that CrossCheck is not appropriate or powerful enough

+-----+
Subject: RE: 2019 IJCNN paper #20386 cross check
Date: Wed, 30 Jan 2019 15:25:02 +0000
From: Dr Anna Koufakou <@fgcu.edu>
To: Bill Howell. Hussar. Alberta. Canada
CC: Garrett Fairburn <@eagle.fgcu.edu>, Vilma Un Jan <@eagle.fgcu.ed>, Daniel Scamardella <@eagle.fgcu.edu>

Ok, thank you. I am not sure if the tool is the issue. A good policy for a conference should address duplicate or almost identical work, not similar text according to a percentage.

+-----+
Subject: Re: IJCNN 2019 Paper #20203 - CrossCheck text-similarity analysis
Date: Mon, 28 Jan 2019 06:37:50 -0800
From: Supriyo Bandyopadhyay <@vcu.edu>
To: Bill Howell. Hussar. Alberta. Canada
CC: Shamma Nasrin <@uic.edu>, Justine Drobitch <@mymail.vcu.edu>, Amit Trivedi <@uic.edu>

I read the report and my personal opinion is that this Cross Check robot is not smart and should never have been used. It even tags the references as similar. If we are publishing two different papers on relativity and both cite Einstein’s papers, your robot will tag that as self similarity (just because the same papers were cited) and reject the second paper on that basis!

Authors publishing multiple DISTINCT paper on the same subject will obviously have a lot of text similarity and this robot is not smart enough to discriminate. If you are using this robot, your conference will be a disaster.

Replies regarding the limitations of CrossCheck

-+-----+
Subject: RE: IJCNN 2019 Paper #20203 - CrossCheck text-similarity analysis
Date: Tue, 29 Jan 2019 16:00:13 -0700
From: Bill Howell. Hussar. Alberta. Canada
To: Supriyo Bandyopadhyay <@vcu.edu>
CC: Shamma Nasrin <@uic.edu>, Justine Drobitch <@mymail.vcu.edu>, Amit Trivedi <@uic.edu>

Supriyo Bandyopadhyay - I entirely agree with you that there are issues with CrossCheck (iThenticate based), but it's easily the best I've got. Actually, it's the ONLY tool I've got to assess when decisions are to be made for >444 papers! It does provide a "level playing field" so that all authors are affected by the same quirks of the system, and judged by the same criteria, somewhat equally. I understand that some journals (OK, at least one that I know of) apply CrossCheck (iThenticate) criteria that are much more severe, which would result in many more IJCNN2019 paper rejections. If you look at my own personal estimates ("OK? calc" column), they are probably worse (high irreproducibility with text-only, even if I am not including References etc, and even there I rely on the CrossCheck background highlighting of text).

+-----+
Subject: RE: 2019 IJCNN paper #20386 cross check
Date: Wed, 30 Jan 2019 10:53:21 -0700
From: Bill Howell. Hussar. Alberta. Canada
To: Anna Koufakou <@fgcu.edu>
CC: Garrett Fairburn <@eagle.fgcu.edu>, Vilma Un Jan <@eagle.fgcu.ed>, Daniel Scamardella <@eagle.fgcu.edu>

"Text-similarity" [acceptance, rejection] of papers is still a required screening for the IEEE and conference [General, Program, Technical] Co-Chairs. Although the initial reason for CrossCheck (iThenticate) may have been for plagiarism, I get the feeling that excessive [self, external] text-similarity is itself of concern. Whether journal copyrights or publishing industry agreements also have a role in this I don't know.

Do you have any further [ideas, suggestions] for "non-trivial" (beyond text-similarity) plagiarism checks?
- Other than already relying on peer reviewers, how do we rapidly do quality checks for 1,500 papers with tight deadlines?
- Perhaps my verbose comments below may stimulate your comments, or perhaps even better yet, questions?

+---+
26Jan2019 Howell : One can go very deeply into the CrossCheck results for a paper, far beyond the summary level in the attachment to this email. My notes for papers >50% similarity do some of this, but to a very limited degree. Unfortunately, even after putting in many hours doing "detailed notes" and "Eye-ball estimates", I still can't reproduce even the summary analysis, and I cannot explain the differences between the "grouped" similarities versus the "low-level, single source" similarities, which can much [higher, lower] that one another. At present, then, I am restrained from using CrossCheck as it is, so as to avoid introducing even more errors into the analysis.

Shouldn't there be a difference between "core" versus "peripheral" sections of a paper?

+-----+
Subject: Re: IJCNN 2019 Paper #19480 - CrossCheck text-similarity analysis
Date: Thu, 31 Jan 2019 12:27:53 +0100
From: Parisa RASTIN <@gmail.com>
To: Bill Howell. Hussar. Alberta. Canada

It seems that you are aware that there is some bias in the process, potentially leading to the rejection of unpublished work.
You can add my email to the blog if you wish.

One last question : in the pdf you send us indicating the similarities in our paper, the whole reference section seems to be included in the analysis and the computation of the percentages.
If it's the case, it produces a significant part of the self-similarity.
Shouldn't the reference section be excluded from the CrossCheck test ?

Replies to difference between "core" versus "peripheral" sections of a paper

+-----+
Subject: RE: IJCNN 2019 Paper #19480 - CrossCheck text-similarity analysis
Date: Wed, 30 Jan 2019 12:20:15 -0700
From: Bill Howell. Hussar. Alberta. Canada
To: Parisa Rastin <@lipn.univ-paris13.fr>

You are absolutely correct that the current criteria do NOT account for whether similarities are in the [Introduction, related papers, common, approaches] sections, as opposed to the "meat of the paper" in the [new theory & approaches, data, analysis, results, conclusions] sections. I think I have mentioned that in the Authors' Guide : IEEE CrossCheck.

But already I was only just able to do the analysis with the simple criteria, and even then I was late in providing results for decisions. Some rejection notices may have to wait until after peer review is complete, even though we could apparently use reviewers in short supply for other papers. Although I'd prefer to see that implemented, already the current approach is too [unwieldy, time-consuming] unless CrossCheck is improved to auto-output the criteria measures.

Authors' suggestions regarding improvements to CrossCheck.

+-----+
Subject: Re: Fwd: IJCNN 2019 Paper #19536 - CrossCheck text-similarity analysis
Date: Mon, 28 Jan 2019 12:37:01 +0100
From: Antonio Luchetta <@unifi.it>
To: Bill Howell. Hussar. Alberta. Canada
CC: ijcnn2019@gmail.com

Dear Bill,
thanks for reviewing and re-interpreting the Crosscheck report.
I agree to post to your blog my previous email, and I add a couple of further suggestions, as you ask me to do:

Before to launch the CrossCheck test, set the "Exclude Bibliography" flag to ON, as the journals' editors do; the references, for their own nature, are fully overlapping other bibliographies or paper titles and quickly increase the similarity index, without being plagiarism at all!
Exclude the sources under a given threshold (1-2 %); our paper, for instance, reaches 43% similarity index with 54 (!) sources with less than 1% overlap and it could probably reach also the 99% if we add among the sources the Holy Bible and Divina Commedia, whereas the "real" (auto)plagiarism index is less than 10%.

Best regards,
Antonio Luchetta, on behalf of other authors

Replies to improvements to CrossCheck

Howell's [incomplete, ongoing, scattered] list of comments regarding CrossCheck analysis can be found on the Publications Guide : IEEE CrossCheck web-page. Note that the Publications Guide website is NOT the same as the Authors' Guide website, althought the former follows the same layout and themes as the latter. It is provided to help Publications Committee members who are new to the task by providing a detailed description of what I did for IJCNN2019, and to a lesser extent IJCNN2017. [chair, blog] links are provided in the Publications Guide that lead right back to the Authors' Guide, but the Authors' Guide does NOT link to the Publications Guide in order to avoid confusion for authors, as the purpose is different, and the Authors' Guide contains the key information for them anyways.

Author's inquiries and replies about how to prepare for and what to expect from CrossCheck

+-----+
Subject: IJCNN 2019 Submission
Date: Fri, 21 Dec 2018 17:56:27 +1100
From: Meng Yang <@student.unimelb.edu.au>
To: Bill Howell. Hussar. Alberta. Canada

Sorry to bother you, could I ask the detailed requirement of similarity checking? Like the similarity must be below xx%? Because there are some common phrases and expressions in my paper, and they cause some similarities..

It would be great if you can help clarify with it. I really appreciate for your help.

Many thanks,
Meng

+-----+
Subject: Re: IJCNN 2019 Submission - CrossCheck similarity % for plagiarism
Date: Fri, 21 Dec 2018 10:12:08 -0700
From: Bill Howell. Hussar. Alberta. Canada
To: Meng Yang. PhD Student. Uof Melbourne. Australia <@student.unimelb.edu.a>

Oops - I didn't properly answer your question (below)! Sorry.

ALL paper repeat common phrases and expressions! For example, for IJCNN2019, at one point :
41/518 approved papers had <=10% similarity
209/518 approved papers had <=20% similarity
(518 approved papers - I think this was before the final count?)

So you almost CANNOT normally have very low similarity levels.

+-----+
Subject: Re: IJCNN 2019 Submission - CrossCheck similarity % for plagiarism
Date: Fri, 21 Dec 2018 09:34:19 -0700
From: Bill Howell. Hussar. Alberta. Canada
To: Meng Yang. PhD Student. Uof Melbourne. Australia <@student.unimelb.edu.au>

Weird timing, Meng! Yesterday I scrambled to write up very [incomplete, inaccurate, speculative] notes for the IJCNN2019 Technical Co-Chairs Khan and Dongbin on this very subject.

However, the key summary context that I have at the moment is summarised in the Authors' Guide : IEEE CrossCheck Portal :

CrossCheck is NOT a complete plagiarism check at all, but it is a very handy and powerful tool towards some screening to better focus on potential plagiarism cases, and to provide details for investigations. We still rely very much on the reviewers' knowledge to pick out cases. Of course, the authors of plagiarised papers may see obvious cases and complain to the IEEE, but everyone wants cases to be caught BEFORE they are published!

So :

Reviewer comments will certainly be [investigated, targeted]
CrossCheck Level I (>=50%) will certainly be targeted (not all should be rejected - but they should be partly re-written if not rejected)
CrossCheck Level II (20-50%) will attract attention. This is especially the case for papers with potential plagiarism of other author's work, with perhaps emphasis on non-introductory parts of the paper? I think that toolsets will rapidly improve to more efficiently address these cases, because at present a significant portion of conference papers fall into this class.

As noted in the "Authors' Guide : IEEE CrossCheck Portal - understanding the plagiarism checks", the great majority of plagiarism is "self-plagiarism" of the authors' previous work. I've commented about that on the webpage, and opinions of others on that subject would be interesting. Nobody wants to spend time reading a paper that looks like a rehash of a previous paper they've read which may not yield significant new [insight, results, etc], or that makes it very hard to pick out what is new in a paper. On the other hand it's probably a little annoying for authors to repeatedly "wordcraft" minor changes to [introduction, established methodologies, theoretical basis] background material that mostly provide context. To me, when there is an excellent description of something, why make it worse? There are reasons - by rewriting things, sometimes [flaws, improvements, etc] become obvious, and the descriptions can improve over time. Different perspectives or analysis are of course important too - but that's not plagiarism.

In the end, plagiarism [checks, decisions] are not cut-and-dried, mechanical processes, at least from what I see.

Pre-posting of papers to arxiv.org and other sites

+-----+
Subject: IJCNN2019 CrossCheck - allowable arXiv-like sites
Reference #: 190128-000901

Customer By CSS Email (William Howell) (01/28/2019 01:55 PM)
As we are finalising paper rejections based on CrossCheck results over the next 3 days, can you confirm whether or not arxiv.org is still the ONLY site permitted by the IEEE for pre-postings of conference papers pre-paper-submission?

As an update of author requests, the following sites are of particular concern :
hal.archives-ouvertes.fr, www.groundai.com, biorxiv.org, tel.archives-ouvertes.fr, www.aclweb.org,

Mr. Bill Howell

+-----+
Subject: IJCNN2019 CrossCheck - allowable arXiv-like sites [Reference #: 190128-000901]
Date: Tue, 29 Jan 2019 12:19:03 -0500 (EST)
From: IEEE Support Center
Reply-To: IEEE Support Center
To: bill@billhowell.ca

To best answer your question, I refer you to the IEEE Author Posting Policy at https://www.ieee.org/publications/rights/index.html#author-posting-policy.

"The revised policy reaffirms the principle that authors are free to post the accepted version of their articles on their personal websites or those of their employers. (Authors of IEEE open access articles may freely post the final version of their papers.) The policy provides that IEEE will make available to each author a pre-print version of that person's article that includes the Digital Object Identifier, IEEE's copyright notice, and a notice showing the article has been accepted for publication. The policy also states that authors are allowed to post pre-print versions of their articles on approved third-party servers that are operated by not-for-profit organizations. "

More detail can be found on the page, but I think it's safe to say that there are many archive sites available and a list could never cover them all, and would be difficult to keep up to date.

I suspect that many of your submissions are not plagiarism, but you probably want to make sure that the level of multiple publication is not to high, if you consider the overlap to be of accepted papers that are about to be published somewhere other than you conference.

The IPR CrossCheck team will also be reviewing uploaded content and will make you aware of any misconduct we encounter. You can send me an email and let me know the dates you uploaded your content to Crosscheck and I would be happy to take a look.

Beth Babeu Kelly
Manager, Intellectual Property Rights
IEEE Publications

Special criticism of CrossCheck rejections

This exchange follows a well-framed criticism of the validity of using CrossCheck to pre-screen paper submissions, including references to the IEEE Operations Manual. Feedback was sought from the IEEE Publications Group, confirming the basis of the authors' criticism on the basis of IEEE policy, but actual practice (not just IJCNN2019's) does not follow the policy. Co-author names are camouflaged for privacy.

+-----+
Subject: RE: IJCNN2019 - challenges to paper rejections on the basis of CrossCheck Analysis [Reference #: 190211-001110]
Date: Tue, 12 Feb 2019 17:59:25 -0700
From: Bill Howell. Hussar. Alberta. Canada
To: Beth Babeu Kelly. Manager - IP Rights. IEEE Publications <@ieee.org>
CC: Chrisina Jayne. General Co-Chair & Workshop Chair. IJCNN2019 Budapest. INNS Director. Oxford Brooks U. Oxford. UK, Co-Author1, Co-Author2, Khan Iftekharuddin. Technical Program Co-Chair. IJCNN2019 Budapest. Norfolk. VA. USA, Dongbin Zhao. Technical Program Co-Chair. IJCNN2019 Budapest. Chinese AcadScience. Beijing. China

Thanks so much, Beth! Your clarifications really help :

Generally speaking, the conference publication peer reviewers determine the amount of redundant work allowed in a a newly submitted manuscript, however, the authors need to cite their previous work. You can choose to have them resubmit another version of the paper once they have updated the reference to their own work and made any changes you suggest.

In our case, the [General, Program, Technical] Co-Chairs decided on the criteria, not the reviewers. In general, reviewers do not have access to CrossCheck, and it would be hard to ensure consistency of application of the [rules, criteria] by an army of reviewers, so this was done by Khan, Dongbin, and myself. (I have cc'd Khan & Dongbin on this email).
We need to also remember that an excessive overlap (>50%)of another authors work, although properly cited, may be a flag of a potential copyright infringement. did the submitting author receive permission to reuse this work from the original authors?

This clarifies a criteria for when copyright infringement becomes a concern. Years ago, a Special Issue of "Best Papers" was produced from IJCNN, and published as a Special Issue of Neural Networks. I believe that the criteria was a maximum of 40% similarity, but new [data, graphs, tables, results] were also required, which was much lower than the 75% you mentioned for a Special Issue, but it involved IEEE copyrighted material being used in Neural Networks.
Unfortunately, in my experience, we do not see quoted text often, so "citation" refers to the bibliographic information associated with the published work, which is not cited in the reference section of the submitted manuscript.

As I was beginning to suspect, based on the reactions of several authors. It's a bit of a pity, as the original authors are not really given full "obvious" credit simply by putting in the reference #, as there is no way of knowing how much material came from them without digging into the reference.
Can you please refresh my memory and tell me how you and your colleagues are accessing CrossCheck? Have you engaged a vendor or are you using the IEEE CrossCheck portal?

We are using the CrossCheck Portal, set up through IEEE conferences back in September, I think. As stated previously, our CrossCheck [criteria, analysis] for all papers >30% overall similarity (self and external) is in the IJCNN2019 CrossCheck spreadsheet.
The CrossCheck tool is far from perfect, and it requires human interpretation, but until we can find a way to interpret these similarities through some automated process, it is the best tool we have.

I think that CrossCheck is a [powerful, useful] tool, but we have to be very careful how it is applied, and it is important to fully inform the authors of the [rules, criteria], PLUS they need tools to access their papers, PRIOR to paper submission, as this would eliminate almost ALL of the problems. It would also dramatically reduce the amount of work required by the CrossCheck analysts!! We have lost far too many great papers just because authors had no idea of where the boundaries were, and how their paper compared to them! CrossCheck would be an awesome tool for an historical study if the development of a scientific concept, and for many other applications. From my standpoint, the only things missing at present for our analysis, are that the CrossCheck summary listing, in addition to "overall" similarity, needs numbers for "[self, external] similarity" and "[self, external] maximum Chunk size" (lines). But that would not be a big change to CrossCheck! (famous last words - but what CrossCheck already does is much more difficult than making those distinctions).

+-----+
Subject: Re: IJCNN2019 - challenges to paper rejections on the basis of CrossCheck Analysis [Reference #: 190211-001110]
Date: Tue, 12 Feb 2019 16:45:22 -0500
From: Beth Babeu Kelly <@ieee.org>
To: Bill Howell. Hussar. Alberta. Canada
CC: Chrisina Jayne. General Co-Chair & Workshop Chair. IJCNN2019 Budapest. INNS Director. Oxford Brooks U. Oxford. UK, Co-Author1, Co-Author2

I will do my best to clarify our criteria and review process again. Please see me responses to your statements in red below.

Can you please refresh my memory and tell me how you and your colleagues are accessing CrossCheck? Have you engaged a vendor or are you using the IEEE CrossCheck portal?

Beth Babeu Kelly
Manager, Intellectual Property Rights

+-----+
Subject: RE: IJCNN2019 - challenges to paper rejections on the basis of CrossCheck Analysis [Reference #: 190211-001110]
Date: Tue, 12 Feb 2019 11:16:16 -0700
From: Bill Howell. Hussar. Alberta. Canada
To: Beth Babeu Kelly. Manager - IP Rights. IEEE Publications <@ieee.org>, Chrisina Jayne. General Co-Chair & Workshop Chair. IJCNN2019 Budapest. INNS Director. Oxford Brooks U. Oxford. UK, Co-Author1, Co-Author2

Chrisina - On the basis of Beth Babeu Kelly's comments (see the IEEE Support Center email below), and the workbook of CrossCheck results, I recommend that you stop all further CrossCheck rejections.

Beth Babeu Kelly - Thank-you for the rapid response. This is very different from what was done for IJCNN2017 & IJCNN2019, as criteria were set for excessive "self-similarities". So Co-Author1 was right - there is no IEEE basis for rejecting papers with CrossCheck results, other than perhaps on a basis of excessive "external similarities", using the IEEE 4 levels of plagiarism. But that does not really reflect "serious" plagiarism, as if only refers to text similarities, and not whether [concepts, theories, formulae, data, results] were plagiarised. Real plagiarism would require far more work to ascertain. Most importantly, papers should NOT be rejected at all for any level of "self-similarities".

Yes, CrossCheck is just a starting point in identifying plagiarism. However, once serious similarity overlaps with or without proper referencing are identified, we can decide whether or not to adjudicate thoroughly. We need to also remember that an excessive overlap (>50%)of another authors work, although properly cited, may be a flag of a potential copyright infringement. did the submitting author receive permission to reuse this work from the original authors? The CrossCheck tool is far from perfect, and it requires human interpretation, but until we can find a way to interpret these similarities through some automated process, it is the best tool we have.

As an example, if whole pages of text are the SAME in a paper as in one of the authors' previous publications, and the overall Crosscheck similarity is 85%, then that is not a problem with the IEEE as long as there is novel content in the paper.

It is entirely possible to reject a manuscript based on what you refer to as "self similarities." We define that as multiple publication and we leave it to you, the editorial body, to define how much redundant work by the same author(s) is acceptable for your publication. An example would be when an author is invited to submit to a conference or journal for a special issue, and in that case 75% simialrity of a previously published work would be acceptable. In other cases, the conference or journal may have an expectation that no more than 35% of a previously published work is acceptable. It is also expected that the authors notify the editors or conference organizers that the submission contains some part of a previously published work.

Typically when multiple publication is discovered, the authors are informed of IEEE policy related to multiple submission/publication, which you will find below. Generally speaking, the conference publication peer reviewers determine the amount of redundant work allowed in a a newly submitted manuscript, however, the authors need to cite their previous work. You can choose to have them resubmit another version of the paper once they have updated the reference to their own work and made any changes you suggest. More information regarding multiple publication/submission can be found at,
IEEE Section_822F

In this scenario, the CrossCheck similarity outputs are not suitable for operational use, as it is necessary that CrossCheck should ONLY list papers with "external similarities", and the overall similarity in the summary listing should only include "external similarities". It took far too much of my time to go into each of the 440 papers >30% similarity to distinguish between [self, external] similarities, and to estimate measures for the criteria. Furthermore, to be of practical value, decisions must be made quickly after the paper submission deadline, otherwise a large amount of many peer reviewers' time would be wasted on papers that cannot be accepted anyways. For IJCNN2019, more than half of papers "to be rejected" have been peer reviewed because I was not able to do the analysis in time. Can your group rapidly do the analysis on >400 papers?

Yes, we review this amount of content on a daily basis. IPR staff receive the simialrity reports and review them within 2-3 days of submission., and will let you know when a submission with a similarity score of 30% or more looks suspicious. We will alert you to any incidents of plagiarism, Levels 1-3, and multiple submission/publications with a score of 75% or more. If you request specific assistance and would like to hear from us regardless of whether we find anything, please contact me directly.

If so, that is the way that future IJCNNs should be handled, by requesting this of your group.

The phrase "citation" is also ambiguous, as it seems clear from author responses that many authors consider the listing of references from which text has been used as a "citation", whereas I usually take that term as meaning that the text is quoted and stated as originating from a specific reference.

Unfortunately, in my experience, we do not see quoted text often, so "citation" refers to the bibliographic information associated with the published work, which is not cited in the reference section of the submitted manuscript.

While IJCNN2019 had ~440 papers out of ~1,500 with similarity > 30%, by ignoring self-citation and by using the looser definition of "citation" (as meaning that only the reference has to be listed), my guess is that almost none of the 477 papers would fail. In contrast, to date 25 papers have been rejected, and a further ~35 are targeted for rejection.

Beth, I appreciate your quick response and guidance.

+-----+
Subject: IJCNN2019 - challenges to paper rejections on the basis of CrossCheck Analysis [Reference #: 190211-001110]
Date: Tue, 12 Feb 2019 11:08:43 -0500 (EST)
From: IEEE Support Center
Reply-To: IEEE Support Center
To: bill@billhowell.ca

Subject: IJCNN2019 - challenges to paper rejections on the basis of CrossCheck Analysis

Reference #: 190211-001110
Response By Email (Beth) (02/12/2019 11:08 AM)

I am a little confused about the nature of your question. The short answer is that CrossCheck is just one tool to be used during peer review. It is the goal of the IEEE CrossCheck program to identify plagiarism in submitted content prior to publishing, and not to exclude publishing works that have generated high simialrity scores within the plagiarism detection application.

The IEEE CrossCheck team in the IPR Office reviews all the similarity reports generated from article submissions, with a similarity score of 30% or more. Reports need to be reviewed by a human, and excluding a manuscript based on a similarity score alone, is not fair to the authors or the publication to which they are submitting their work. We do not recommend excluding any data (bibliographies, abstracts, etc) from the similarity match parameters, just to simulate a lower similarity score. It is imperative that peer review judgements are not based on similarity scores alone, without an actual review of the results.

CrossCheck results will show the percentage of content that has been previously published. That content may be properly reused material, or the author's own previously published work. In those cases, it would not be plagiarism. Any paper that has a score over 30% similarity should be reviewed to make sure that there is not a substantial amount of un-referenced reused work. Any percentage of plagiarism would be unacceptable, but if there are only small bits of text that are scattered through the paper, it's possible that CrossCheck is only detecting common phrases.

The IPR Crosscheck team reviews all submitted content and will review anything at your request, should you have any questions about a particular submission. With that said, please forward any items you are concerned about, we will perform a quick review of the content.

Please contact me directly at @ieee.org to expedite any further concerns.

Beth Babeu Kelly
Manager, Intellectual Property Rights

+-----+
Subject: Re: IJCNN 2019 Paper #20406 Rejection Based on Software
Date: Mon, 11 Feb 2019 19:16:42 -0500
From: Co-Author1
To: Bill Howell
CC: Chrisina Jayne. General Co-Chair & Workshop Chair. IJCNN2019 Budapest. INNS Director. Oxford Brooks U. Oxford. UK, Co-Author2

I appreciate your text. When I have time, I would like to write an article about this issue. Sorry, I do not have time now (I reviewed the six IJCNN papers that have been assigned to me to review).

I would like to remind you about what you wrote "pre-screening". There are very strict IEEE rules for you to reject a paper based on pre-screening (desk rejection). Per IEEE Policies, you or any editor cannot reject any paper based on human "pre-screening" without agreement from at least two other associate editors, at alone your pre-screening is based on a software alone. See how bad a pre-screening could be:
A Theoretical Proof Bridged the Two AI Schools but a Major AI Journal Desk-Rejected It

+-----+
Subject: Re: IJCNN 2019 Paper #20406 Rejection Based on Software
Date: Mon, 4 Feb 2019 15:47:54 -0500
From: Bill Howell
To: Co-Author1
CC: Chrisina Jayne, Co-Author2

Thanks, Co-Author1 & Co-Author2. Good work. I've appended my comments (preceded by appending ">> ") for selected paragraphs that I have numbered, of your last email below. I agree with many of your points, perhaps not all, but most importantly, you do raise points that need to be answered by IEEE peer review and Intellectual Property (IEEE-IP) experts.

Additionally, I can do the CrossCheck analysis for your re-submitted paper, but I ask that this be delayed until 15Feb2019, as I am desperately trying to do my peer reviews, and catch up on long neglected [projects, matters of other societies, personal matters].

> From my perspective, your key points, and a couple of extra points, requiring IEEE feedback are :

1. Subsection 8.2.2.A. - If similarity tests are deemed to be "additional review criteria" then by the IEEE Operations Manual, our current checks are inappropriate. If these tests are considered to be "pre-screening" or a separate class (not stated in the guidelines except as "plagiarism check", which is NOT what we do in a thorough sense), then I assume that they are appropriate. "Text external-similarity" tests are not a serious check on plagiarism, and only apply at a [simple, trivial] level to text.

2. SUBSECTION 8.2.4..B defines plagiarism as you have stated, not as "text-similarity". This is a strong basis for your statement that "... text-similarity is "an additional screening criteria" ...". That applies to BOTH [self, external] text-similarity, noting that CrossCheck results do NOT ascertain plagiarism other than at the simplest level - external text similarity, without delving into more serious (core) plagiarism, such as the claiming of [concepts, theorems, derivations, data, results, conclusions]. While CrossCheck may be one of many tools of help to do the latter, a serious amount of [documented, defensible] work would be required to establish solid grounds for "core" plagiarism.

3. Is "text self-similarity" (that is, similarity arising from publications having and author list that includes at least one of the IJCNN paper's co-authors) a viable criteria for CrossChgeck [accept, reject]? If not, there is a BIG problem, as essentially all of the CrossCheck summary listing of results are essentially useless, as most of that is due to text self-similarity.

4. The IJCNN2019 CrossCheck analysis is NOT compliant with the IEEE's 4 "levels of plagiarism" (these are listed in the Authors' Guide : IEEE CrossCheck") and with some of their descriptions on how to do the checks, which are a tougher than the IJCNN2019 analysis. In a strict interpretation, this could invalidate all of the IJCNN2019 CrossCheck analysis.

5. One can argue that individual conferences should not be guessing at criteria and interpretations. If text-similarity is retained, then it would help if we had IEEE [normal, standard] (this list is incomplete, and probably somewhat erroneous) :
a) a hard-set list of criteria based on CrossCheck.
b) a "formula" that directly uses CrossCheck output for [accept, reject] decisions. I am not suggesting that this a "definitive answer", as many details may influence what is appropriate for a paper, but I do think such a formula can apply to the great majority of cases. Somewhat higher-than theoretical" allowable levels for the criteria are probably necessary to accommodate reality.
c) a strongly [enhanced, modified] CrossCheck that :
i. properly identifies [self, external] sources and their [individual, collective] similiarity %. I don't trust the max%Group" measure - which often seem to be a mix of [self, external] sources?
ii. properly estimates [combined, self, external] overall % similarity
iii. produces a TEXT listing of results with respect to all pertinent criteria that is easily copied to a spreadsheet of one-line summaries (or modifies the current summary listing of results for all papers to such a format, as the current summary listing is easy to [copy, paste].
iv. preferably it would distinguish between "peripheral" sections such as [Introduction, Literature review, description of alternative concepts and their basis, results of other authors], versus "core" sections such as the authors' [concepts, derivations, methodology, derivations, data, analysis, results, conclusions]
v. References should not be included
vi. It is unclear to me whether CrossCheck is effective with [tables, formulae, images] - I suspect not based on some tables and figures that I've looked at. Until it can properly assess those, they should be assessed separately, if at all.
vii. In the case of IJCNN2019, it was deemed important as to whether text similarity was "fine-grained", rather than consisting of "large chunks" (fuzzily defined, practically <1/2 page (1 full column of text)). I think that this is relevant.
d) I'm strongly in favour of publicly-listed CrossCheck analysis results, both so authors can see for themselves the overall results and how their paper "fits in", and for all IEEE-related conference CrossCheck volunteers, so that they can see how their work compares to aother conferences. This is predicated on using "paper submission numbering" (eg N-00000.pdf) which is not publicly associated with author names and paper titles.

In other words, what I did required an un-reasonable amount of effort to expect from any conference, and the

I will wait for any [corrections, additions, deletions] you may have before sending the above issues to the IEEE for responses. I will copy you on that email. Note that I don't think detailed work on the issues by myself will be adequate, and it would not properly reflect IEEE's [policies, guidelines].

Thanks again for your work and insights, Co-Author1.

+-----+
Subject: Re: IJCNN 2019 Paper #20406 Rejection Based on Software
From: Co-Author1
Date: Sun, February 3, 2019 3:42 pm
To: Bill Howell Cc: Co-Author2, Chrisina Jayne

In response to your request to compare all my past papers, I would like to discuss a case of Ithenticate in the past.

1. Without revealing the name XYZ in the case: The person XYZ used Itenticate to compared with all the papers but the similar papers are mainly from my group. He administratively rejected my paper based on only the output from IThenticate. I submit a complaint.
>> CrossCheck is based on iThenticate, and is the IEEE standard for text-similarity checks. I agree with your concern over the term "self plagiarism", which does NOT conform to common understanding of the term "plagiarism", and I should not have used the term "plagiarism" at all. I don't know the common Law in this area, but my guess would be that it could be the basis of a lawsuit.
>> My understanding is that it is a responsibility of reviewers to report suspected plagiarism. I assume that seldom happens, but I don't know. As this gets more contentious, to me reviewers would be ill-advised to make such suggestions, as I doubt that few recommendations could be sufficiently bascked up at the time to hold up in a court of law.
>> Without CrossCheck or other IEEE provided tool, we do not have an effective tool for measuring text [self, external]-similarity, even if there is a stated IEEE policy to reduce that (plagiarism is the word that they typically use in documentation, and that may be their mistake).
>> I assume without knowing, that these checks are not just an issue for authors and conference committess, but for IEEE as a publisher.

2. Why? As the key words (that do NOT have to be single words!) that iThenticate uses, the software reached a high percentage as the sum from the similarities values of all similar papers based on keywords. Because I was among the few who use the terms like "Autonomus Development", "Developmental Learning", "Developmental Networks", my paper reached a high percentage of similarity. If I were among those who do a subjects such as pattern recognition only, my paper would not reach a high similarity because iThenticate does NOT use common terms of pattern recognition (e.g., neural network or clustering algorithm) as key words to search for similarity so as to reduce the amount of time of running the software as you complained about.
>> I have a lot of questions about the internal workings of CrossCheck myself. For example, References don't seem to be included, but I can't be sure of that based on my own primitive estimates (based on the CrossCheck highlighting of text). CrossCheck isn't perfect, and doesn't provide key criteria for making a decision (we manually pulled these out via the CrossCheck reports). My gut feel is that it would correspond to a "reasonable care" level of assessment. In any case, I doubt that any conference volunteer has the resources to develop and test their own system.
>> The comment above deals with CrossCheck results and inner workings.
> From my own experience, CrossCheck seems to do a pretty good job of what it really does - estimate text-similarity, yet it does not distinguish between [self, external] self-similarity as is required. One other example I saw was a different system (NTU's system (I forget the name ?TurnitIn? - its on the blog), which gave lower results but similar text highlighting. Whether it is also based on iThenticate, I don't know.
>> CrossCheck is very FAST - it is the DETAILED MANUAL ANALYSIS that is slow! (1-3 minutes per paper without going into secoindary details, more time when that is necessary. Learning the [IEEE policy, CrossCheck], setting up [criteria, formulae] and other work takes significant time.

Let me give a analogy: If a later paper from Albert Einstein's is checked by IThentivate, the paper has a high similarity with his own papers simply because Albert Einstain was the few persons who do relativity!

3. I claim: iThentivate is NOT a FAIR software from a company whose power was not checked by the scientific community!
>> I would have thought that many scientists were involved in the development of iThenticate, and the IEEE's CrossCheck, but as to whether a scientific analysis has been done of the results, I don't know.
>> I look at iTenticate as a "fair and level" playing ground with quirks and some evolution to go. I can't say if, on the whole, the results are "fair". Here I think the issue of providing tools to authors to pre-check their papers is the "unfair" issue.

In summary, Ithenticate systematically penalizes a research leader. If he is among the few who use some key words that many others have not yet used, he cannot satisfy Itenticate because all the terms he used for his original research and made great process on are considered keywords for "plagiarism" of his OWN work.

4. By the way, "self plagiarism" is a misconception from persons who do NOT understand the meaning of plagiarism: The definition of "Plagiarism" (from Google): "The practice of taking someone else's work or ideas and passing them off as one's own."
Note: "Someone else"
>> Correct. I do refer to the phrase "self-plagiarism" as an oxymoron, and have converted to using the phrase "text [self, external] text similarity" instead.

That is, the correctly defined plagiarism MUST plagiarize other's work, not the author's own work.

By the way, the person XYZ misused iThentivate as a basis for rejection of my paper was removed from his/her position after I complained to a Committee with evidence. However, I do not know whether his/her removal is only because of my complaint.

5. I submit to you: A rejection of any papers based on a software only without at least TWO human experts to look through the papers is an abuse of administrative power, an academic misconduct, and a lack of due diligence. This behavior is an violation of IEEE Policy: IEEE Publication Services and Products Board Operations Manual
>> That's certainly not how text similarity checks are done now, now. Subsection 8.2.4.D "Guidelines for adjudicating different levels of plagiarism"

6. - The review process shall ensure that all authors have equal opportunity for publication of their articles. Acceptance and scheduling of publication of articles in these periodicals shall not be impeded by added criteria and procedures beyond those contained in the IEEE review requirements contained in this Subsection 8.2.2.A. (Co-Author1: YOU ADDED CRITERIA.)
>> Here I see "peer review" as distinct from 'text-similarity" checks. Does "text-similarity" qualify as a "pre-screen", which how it was used for IJCNN2019 papers, other than those for which iI was too slow in producing results. Whether "Subsection 8.2.2.A" takes precedence over CrossCheck tests, I don't know. Keep in mind that your comments about CrossCheck also apply to peer review, but it has a long tradition and is accepted, for now, but perhaps current peer reviews will have to change as well.
>> SUBSECTION 8.2.4.B DEFINES PLAGIARISM AS YOU HAVE STATED, NOT AS "TEXT-SIMILARITY". THIS IS A STRONG BASIS FOR YOUR STATEMENT ABOVE - "AN ADDITIONAL SCREENING CRITERIA".

- An article is considered in review if it passes the prescreening process and is forwarded to referees. An “administrative reject” refers to an article that does not meet the pre-screening measures and is, therefore, returned to the author(s) with explanation. (Co-Author1: YOUR IS ADMINISTRATIVE REJECT.)

7. - Organizational units that prescreen submitted articles, as enabled by this subsection, shall include a statement of the prescreening measures in the unit’s instructions to authors. (Co-Author1: YOU DID NOT PROVIDE SUCH STATEMENTS ON IJCNN 2019 INSTRUCTIONS.)
>> Correct - we did not include this in the instructions. We did have an "Authors' Guide : IEEE CrossCheck" (for which initial screenings were more severe than later screenings). But that was stated on the www.ijcnn.org web-site as being targeted for young authors and senoir authors without a knowledge of IEEE paper submission requirements. Few people probably looked at it. More importantly, many if not most authors do not have direct access to CrossCheck, which would tell them which parts of the text would have to be changed. This should be available to all authors, otherwise they really are "guessing in the dark".

8. - For all scientific articles and communications published in IEEE publication and information products, the Editor-in-Chief or another editor from the editorial board of the publication shall select at least two referees who are competent and have experience in the area of the subject matter of the article. Editors-in-Chief of a specific periodical cannot act as formal referees for articles being considered for publication in their area of responsibility of that periodical. (Co-Author1: YOU DID NOT HAVE ANY REFEREES for #20406.)
>> Again, I see a big difference between "peer review" and "text-similarity" checking. We did not have two "CrossCheck analysts" (not reviewers") per paper. Given the volume, we split analysis up between three people, and I subsequently re-did all analysis to have a [consistent, full] analysis for every paper >30% overall similarity. The two the General Co-Chairs make the decision, but I'm sure they don't have time to re-analyse the details, and one might consider them to be Editors-in-Chief in this case!
>> Volunteers will not likely want "text similarity checks" as a "permanent volunteer job", so failing payments to the IEEE to do it, almost everyone will be on a learning curve. The IEEE provides professional conference services of a broad range. Perhaps it will get to the point where this is required, in which cases I suspect registration fees will climb considerably, not to mention IEEE membership fees. Maybe I'm wrong there. The same applies to copyright checks and perhaps to other administrative tasks of the conference.

9. - Submission of articles to referees for “informal review” is to be avoided. (Co-Author1: YOU DID NOT EVEN GIVE INFORMAL REVIEW.)
>> CrossCheck analysis is not peer review, but it was a formal analysis using a standardized tool based on iThenticate, according to quantitative [measures, criteria, formulae].

I strongly urgent Chrisina Jayne to refrain from this abuse of administrative power of organizing IJCNN 2019 as the program chair person.

Very concerned,

-Co-Author1

+-----+
Subject: RE: IJCNN 2019 Paper #20406 Rejection Based on Software
Date: Sun, 3 Feb 2019 11:54:39 -0700
From: Bill Howell
To: Co-Author1
CC: Co-Author2, Chrisina Jayne

Co-Author1 & Co-Author2 - Thanks for your comments! These are well stated, and you have done work to generate your own comparison in a clear format. I know that your email is "private" as stated, but I really would like your permission to post it on the "Authors' Guide : IEEE CrossCheck" web-page, with your names and emails removed (as "Author anonymous"). Perhaps you, and others as well, will come up with other ideas for how best to address [self, external] similarities, and to quantify the criteria for equal treatment of authors.

Simple thresholds, even with separate regimes (in this case [<=30%, 30% < xxx < 50%, 50% <=] similarity), are, as you say "arbitrary, rigid, and authoritarian". Furthermore, there is some degree of variance in even CrossCheck's results, which is why some of the thresholds are higher than used for at least one journal in practice. I guess the other side of the issue is that at least we have a clear basis for the decisions, and authors can refer to all other cases to see that their own papers were judged by the same criteria, plus, for all conference papers, they can see the number of papers in each class and the results for their criteria. It is a real challenge in time to do the assessment for the papers. CrossCheck is fast but it is slow (boring) work to dig into the results and extract : the [self, external] max%Group; verifying that arxiv.org papers are by the same authors; and estimating the maxChunk size. It's really too slow and long to be practical.

The General Co-Chairs (Chrisina and Zoltan) make the decisions, whereas I just provide the analysis. It is worthy to note that you are a Bronze Sponsor of IJCNN2019, you have organised a (?workshop or special session? - I forget which), and you have been a prominent member of the community for some time.

I haven't heard much from Chrisina in the last few days. She's probably swamped with work, so don't be surprised if her reply is delayed. I am also a bit in limbo. After my head-on collision with a truck hauling two giant tanks of propane 11Dec2018, I am now doing a month-long test with a CPAP machine for sleep apnea. I'll have to go into Calgary in the next few days to fix things up with the supplier.

+-----+
Subject: Re: IJCNN 2019 Paper #20406 Rejection Based on Software
Date: Sun, 3 Feb 2019 12:14:38 -0500
From: Co-Author1
To: Chrisina Jayne
CC: Bill Howell, Co-Author2

Bill and Chrisina: I like to add: Please do not consider conference-then-journal and journal-then-conference different in terms of the later publication adds substantial new work to and casts new light on the material previous publication.

+-----+
Subject: Re: IJCNN 2019 Paper #20406 Rejection Based on Software
Date: Sun, 3 Feb 2019 12:00:12 -0500
From: Co-Author1
To: Chrisina Jayne
CC: Bill Howell, Co-Author2

Thank you very much for doing this difficult work for IJCNN 2019.

This is private, but please carefully consider the rationale behind my reasons bellow that is based on well-established practice in INNS.

(1) INNS society allows a journal publication to add 25% new material to an earlier conference paper. This is reasonable as the later submission not only add 25% more material, such new material casts new light onto the 75% of previously published material. Without such a policy, the 75% of the previously published material is crippled without being self-contained by the new paper. For this reason, I respectfully object to your statement "However, text similarity, even with one's own work, is still a concern to the Chairs (as it was for IJCNN2017 Anchorage)."

(2) Your statement "the primary basis of rejection was that the paper exceeds 50% overall similarity text content (but only by a bit)" is arbitrary, rigid, and authoritarian. Please kindly consider the above reasons.

(3) What I said "key words" include the 4-to-8 word phrases because "key words" in a journal paper are seldom to be an individual word. The key is that such phrases are NOT semantics, as I explained in my earlier email. I work on natural language understanding. You should not use such software for 1-to-8 word phrases to reject papers without a human expert to look through carefully before sending your rejection decisions. For this reason, I object to your statement: "You hit the nail on the head with that comment, although a threshold of 4-to-8 word phrases seems to be used, rather than just single words."

(4) We also carefully estimated the new text-based new parts for our IJCNN 2019 paper, not semantics I discussed above. For example, the introduction section and the theory section are new, not just newly drafted. Some experimental text paragraphs are also newly drafted. The space of new parts in our IJCNN 2019 (in terms of lines) over the space of the entire paper is 50.9% (378 lines/742 lines, including the references). This is in comparison to our Transactions on Industrial Electronics (TIE) paper that was an earlier publication. We did not know that it was put online when we submit the IJCNN 2019 paper but this is irrelevant because the 50.9% newly drafted space.

(5) To save your time, we attached the IJCNN 2019 manuscript marked with colors for you to check. The new parts are highlighted with yellow, green, blue and purple colors. Namely, all the color parts are new for this IJCNN 2019 submission and they amount to 50.9%.
(a) Green color: The Emergent-Context Emergent-Input Framework is the main theoretical and methodological novelty of our paper.
(b) Blue color: The emergent motor area setting in DN-2 is firstly proposed.
(c) Purple color: The new comparative experiments are performed to compare DN-2 with two different settings (handcrafted motor concept zones vs. emergent motor area).

(6) When we submitted this IJCNN 2019 paper, we did not cite the last TIE paper because we did not notice it was published online (on Jan 14). We apologize as we could have cited it even if it has not appeared yet.

Please kindly consider this case with detailed evidence above.

+-----+
Date: Fri, Jan 25, 2019 at 4:18 PM
From: Co-Author1

I have discussed with an IEEE EIC with regard to the program that IEEE uses. He proved to me: "All IEEE journals use iThenticate for similarity checking. "

IThenticate is a keyword search program only, not semantics. The percentage number that it reports is only about keyword similarity.

The threshold that you uses is further arbitrary, as even a highly original paper has a high similarity in key words.

Please provide the complete output from the iThenticate program if you still insists on rejection based on a software only.

However, it is highly unethical to reject a paper based on a software, without a human to check and prove plagiarism. I will surely subject a complaint to IEEE if you reject a paper based on your arbitrary threshold from only a keyword software such as iThenticate;

Please absolutely avoid doing this for any papers, not just our paper.

Typical preliminary response to authors whose paper was rejected based on CrossCheck similarity analysis

The email starts with a listing of which criteria caused the paper to fail. I should have always included the actual estimates for these criteria, but I did so only sporadically. The authors can check the "IJCNN2019%20CrossCheck.xls" sheet (see link below), and see the actual numbers used for all papers in the conference, and their "OK? calc" recommendation for [accept, reject] on the basis of CrossCheck only. Of course, papers must still pass peer review!

I am including the following relevant background information :

The CrossCheck report for your paper is attached.
My notes from my first-pass analysis are below. (This was ONLY included for papers >50% similarity, and for which I had done the CrossCheck analysis in the early stages of this process.)
Here is a summary listing of IJCNN2019 CrossCheck summary outputs, to which we have applied criteria as a screen, and after which further investigation is often required. Also, given the volume of work and my haste in doing it, there are numerous errors in the IJCNN2019 CrossCheck spreadsheet.
A [general, outdated, sometimes erroneous] description of the checks is found on the Authors' Guide : IEEE CrossCheck " web-page.

Notify me by reply email of any mistakes that you can point out in the "background information". Pay close attention to :

CrossCheck report - May show an outdated, earlier version of your paper upload before the deadline.
IJCNN2019 CrossCheck summary listing (IJCNN2019 CrossCheck.xls) :
%similarity CrossCheck - You can estimate your own %similarity from the CrossCheck report that I attached. See the "CrossCheck_variables" sheet, "Example level III estimate of '%similarity : Eyeball'". Doing it this way isn't accurate, and CrossCheck includes different things, but it can give you a feel for the measure, and it can alert us to "borderline cases" for which the [accept, reject] decision can go either way. For equal treatment of authors, decisions use the CrossCheck similarity.
maxChunk (pg) - This is a rough estimate of the maximum fraction of a full page that similar text from the same source (same color) occupies (not that contiguous text often goes from one page to the next). It's highly dependent on judgement, to say the least, but most of the papers were assessed using a VERY easy "<1/2 page (0.49)" rating.
max%Group [self, external] - I may have put the CrossCheck result in the wrong column, by error or because it wasn't easily apparent that the source was by the authors, and was not external.
arXiv same authors - The IEEE allows pre-posting of papers to arXiv by the same authors, and this makes a big difference to high-similarity, but that does not currently apply to all other. For IJCNN2019 specifically, other sites such as (hal.archives-ouvertes.fr, biorxiv.org, tel.archives-ouvertes.fr, www.aclweb.org) is allowed. Let me know if excessive similarity for your paper arises from one of these sites, or request that another site be allowed.

Based on your results above, let me know if you think your paper should pass the conference CrossCheck criteria based on mistakes in theoriginal table. Of course, peer reviewer approval etc is also required.