Bioinformatics for Discovery & Global Collaborations

By Adem Lewis

It’s now my pleasure to introduce
today’s presenter, Dr. Yuri Quintana. Yuri is Director of Global Health
Informatics in the Division of Clinical Informatics of the Beth Israel Deaconess
Medical Center at Harvard University and he’s an assistant professor of medicine
at the Harvard Medical School. His research is focused on developing
innovative technologies and systems that empower communities of healthcare
professionals, patients and families. He’s developed award-winning global online
and mobile collaboration networks that have had a transformational impact in
pediatrics and cancer care. Yuri was a principal investigator in the HealNet
Network of Centers of Excellence, also headquartered here at McMaster
University in the late 1990s and early 2000s, and served as Director of the New Media
research lab at Western University in London, Ontario. And he’s also
held high-tech positions at IBM and [...]. Yuri has kindly offered to share
his expertise with us today to better familiarize us with different approaches
to biomedical informatics and innovations in big data platforms for
biomedical research. We look forward to his insights and advice today.
Welcome, Yuri.

Thank you very much, Diana and Marshall, and it’s a pleasure to be
with you here today to share some of the innovations that I’m doing with
colleagues not only here at Harvard but around the world. I’m very excited to
also be part of this network seminar series; I myself got my start in an NCE
that was also led by Diana Royce and others out of McMaster on evidence-based
medicine, so I have a very dear affection for all the great work that emanates out
of Canada and the great impact that it’s had globally. So thank you, Diana, for the
invitation today. I’m going to be talking about bioinformatics and its potential
for discovery and how it can advance our collaborations worldwide. So why do we
need bioinformatics? Well, let’s start first by looking at some of the
challenges that we have in healthcare. One of the things to note is
that chronic diseases are rapidly expanding all
over the world, and not only are they increasing, the costs of dealing with
chronic diseases are increasing at an astronomical rate and will be almost 48
percent of GDP in the next few years. Respiratory diseases and allergies are a
major contributing cause of these conditions. We also see that there is
movement of people within countries, across countries, and across continents.
And so we have a growing diversity of genetics in the patient population pool.
Unfortunately, most research that has been funded has been in high-income
countries, and hence the research has been more
biased towards the genetics of people in those countries. And so as we have a
growing population of diversity, we will need to find better ways to understand
how genetics can help us develop better treatments. We also know that
evidence-based medicine plays an important part in it. As I mentioned, I got
started in the HealNet network in Canada putting evidence to work in
application, but we found, both in Canada and the US and other countries,
that evidence alone isn’t enough; implementing it in a way that can
reduce errors is very difficult. And as we start to try to translate research
into bedside treatments, that process can take up to 17 years. So we need
better ways to communicate and do translational research, and so not
only do we need better data, but we also need better learning
networks and better ways to collaborate across different disciplines and across
countries. How do we actually do that? One of the challenges that we have
is that many places – hospitals and research centers and universities – have
incomplete data for any one particular disease, and this is not only
because of the number of patients you have, but also the difficulties in
collecting data, interoperability and data sharing. Also, a lot of centers that
are developing genomic data are finding it increasingly challenging to link that
clinical data with genomic data, and then to try to do it across institutions and
across countries becomes a lot harder. I will be looking at ways in which
we can tackle this in the next few moments. We also find that a lot of
centers don’t collect family history or environmental history and, increasingly,
as we want to optimize treatment, these will play an increasingly important role.
In order to go beyond what we’ve done up to now, we’re going to need to be able
to develop personalized treatments by tailoring them to genetics, the clinical
and family history, and environmental factors. To do this we’re going to
have to be able to collect vast amounts of data, integrate it, and make it
accessible and usable. What is bioinformatics? Bioinformatics is an
interdisciplinary field that develops analytical methods and software tools
for understanding clinical and biological data. It combines elements
from many fields, including basic sciences, biology, computer science,
mathematics and engineering, and many more. There have been several books
over the last 20 years, and journals that have developed in this field. The field
actually begins in the 60s, and one of the earliest clinical
computing systems actually started around 1967 and became operational in the early 70s.
The founders of our division, Warner Slack and Howard Bleich, were among the first
to have implemented that, here at Harvard Beth Israel Deaconess Medical
Center, and in other centers across the United States and Canada. Many of these
systems expanded not only within hospitals but increasingly trying to
reach into the home, so we saw consumer health innovations, telehealth; we started
creating more advanced ways of collecting data at a regional and national
level, and that has become a field unto itself called public health informatics.
Going the other way, in terms of basic science, we’ve also developed more
advanced ways of collecting research data and linking that to clinical
informatics, and that has led to an area known as research informatics. So,
within biomedical informatics there are many related
fields which overlap, and if we look at some of the applications that have been
developed we see that, increasingly, there are
many such systems across these different areas, both in the clinic, at home, at a
population level, or in the laboratories or research centers. So the challenge
is: How do you take data from these different systems, integrate it and make
it available in a way that you can actually analyze and create longitudinal
studies? Well, there have been several key developments that have happened at many
centers; some of the things that have been developed at my Center here – the
Division of Clinical Informatics – have been: one of the first ways of collecting
data directly from patients, and actually this happened in 1967, and the first
photo you see there is Warner Slack. Actually today, we celebrate 50
years to the day since they wrote an article in the national press and he was
on national TV explaining how it’d be beneficial to collect data directly from
patients, thus freeing the doctors to be able to do more advanced work, and
have a richer family history. It’s ironic that today, 50 years later,
we’re still trying to implement this vision, but we’ve made progress. We’ve
also developed some of the first ways to link to literature data and this, for
biomedical research, is very important, as you’re looking at clinical data, to be able
to pull up the relevant clinical studies and analytical tools. So PaperChase was
developed here and eventually contributed to the development of PubMed.
We also developed some of the first medical record systems. But equally
important, we’ve also developed ways to connect patients to their babies in NICUs, patients to their own medical records with PatientSite, and more
recently, with InfoSage, being able to link elders with their family members
who are caring for them through online and mobile networks. And we believe very
much, in this Center, that families and patients can play a very important part
in providing necessary data to get a more complete, accurate picture of the
healthcare landscape and produce not only better treatments, but better
research. So why is it so difficult to collect all this data? Well, in part it’s
because there’s a lot of data and there’s also lots of different ways to
classify that data and organize it. The problem isn’t
that there aren’t standards; the problem is that there’s too many standards and
we can’t quite always agree on which standard. And even within a standard,
there’s different ways of implementing the codification of data within that
standard. There’s also highly incomplete data and
so no Center has the complete medical history of any one patient. So we’re
dealing with highly fragmented data, codified at different levels of accuracy,
usually distributed across multiple healthcare systems. And, as we develop
research databases, we have to deidentify, and it’s very difficult to completely
deidentify that data. So there are different degrees of deidentification. Here you
see a nice framework developed by the Future of Privacy Forum that shows
different approaches to deidentification and degrees of it. One of the big
pushes that has happened in bioinformatics has really been the
advancement of technologies that allow us to analyze the human genome, and there
was a major initiative in the United States, as well as several other
countries, in the 1990s and early 2000s which really, for the first time, mapped
the human genome, or at least a first draft of it, which enabled further analysis. And as these technologies become more advanced, we’re
able to analyze these data at a deeper level: not only at the DNA level,
but then down to the proteins. And so you see a bunch of technologies such as DNA
sequencers, DNA microarrays, and mass spectrometers: they’re creating
tremendous amounts of data, and each year there is a new generation of data with
more micro-granular data, and so the challenge is not only how to manage
this data, but how to integrate data that has been analyzed at different
levels of complexity, with different generations of technology. And the price
of this technology continues to drop dramatically, and so we’re going to see
even more data. The problem is not that we don’t have data; the question is how to
analyze it. So let’s move to platforms and networks that are trying to do just that.
There has been quite a bit of talk by a group here in the United States (the
Institute of Medicine, now called the National Academies), and by
other groups around the world, about developing something called the learning healthcare
system. I’m working in this area but at a global level, and this really tries
to have a continuous cycle where we analyze data; we develop methodologies
for treatment; we deploy at local centers; we learn from that data; we
discuss the data, not only locally, but then aggregate that data and then give
that back into the community. And as a community, pool our knowledge and be able
to learn from that and develop a new generation of treatments. And while this
has been going on for many decades, how to operationalize this at a large scale,
even within an institution or across institutions, becomes quite challenging.
What you need is very good technologies for being able to share
data and train and educate people on the methodologies. For doing this, I spent 12
years at St. Jude Children’s Research Hospital in Memphis, Tennessee, developing
tools and platforms for pediatric cancer. Two of the big tools whose development
I helped lead were a global platform for sharing
education and sharing protocols, and a data collection tool called Pond4Kids that
allowed clinical protocols to be shared and data to be collected from different
centers, aggregated and analyzed in anonymized formats. Several collaboration
networks – in fact, 400 different groups formed, and over 200 of those were
clinically oriented, developing shared protocols, and over those 12 years many
of the countries were able to raise their survival rates by 20 or 30 or more
percent in their area. So collaboration is essential, not only in developing
countries but also developed countries. The development of these platforms is
very difficult. About the same period that I was building
Cure4Kids and POND, another initiative called caBIG was developed
in the United States. Millions of dollars were spent on this; unfortunately, while there
were a lot of individual tools developed that were state-of-the-art, the integration of
these tools did not work very well. So after many years, in 2011 they decided
to discontinue this platform because there was just not enough coherence in
the system. So this is sort of a lesson learned, in that the design of
this has to be done in a way such that there is seamless integration of the
tools. More recently, the American Society of
Clinical Oncology has developed CancerLinQ, which is trying to do
things similar to Pond4Kids, in that it’s sharing clinical data from various
centers and being able to aggregate that to be able to create a learning system
within oncology here in the United States. I more recently have been
developing something called Alicanto, which is a learning system that combines
both education and access to clinical tools for collaboration, and this
tool essentially tries to create a seamless environment where you can
access all the education and the data in one Center, and create online
communities where people can meet online to discuss and share and develop their
protocols or their research projects together. So the problem isn’t that
there aren’t individual technologies to do this; the problem is that they’re
usually fragmented and it’s very frustrating to learn ten different
systems to be able to participate in your research community. So this is an
integrated system. There are two networks that have launched in the last two years.
One is called OpenPediatrics at Boston Children’s Hospital, which has
over 20,000 users already and a growing amount of educational content and online
groups that meet, and the more recently launched MADCAP Network, out of the Dana-Farber Cancer Institute, that is a collaboration of many institutes here in
the United States, England and universities in Africa, and this is
studying the genomic basis of prostate cancer,
to develop new advanced research. And I’m very excited for both networks because
both of them involve high degrees of collaboration and global data sharing.
There are some tools that have been developed to advance the sharing of
genomic data. One of the more famous is i2b2, developed here at Harvard
and with other colleagues across the United States, and a more global
initiative called tranSMART. These two have now sort of merged and this allows
you to share lots of clinical data and create ways to aggregate
that and share it among different groups and install analytical tools. There are
several organizations that have been formed using the tranSMART platform. One
of the largest is eTRIKS, funded by the European Union, which is a
combination of both universities and private sector pharma companies working
to aggregate data and work together on biomedical discoveries.
There are also large datasets that are becoming available for sharing and
advancing research. One of them is the NIH Genomic Data Commons, which is
available online. We also see things such as the bio-banking resource developed in
Europe, which has several member countries, and they’re developing
electronic informatics tools as well as creating databases that are shareable.
Here in the United States, there are several initiatives for creating
repositories for data. So, after large clinical studies end, there isn’t a
really convenient place to archive that data and make it available for reuse. So
several initiatives are underway. One of them is the Big Data to Knowledge
initiative funded by the National Institutes of Health here, and they have a
growing repository and they’re working very hard to come up with ways
to curate this data, and index the contents, so that it makes it easier
for people to integrate it into other repositories
for reuse. Genomics England has a large initiative as well and you’ll see here
that they’re putting, like many other groups, a lot of emphasis on how you
curate the data, validate it and classify it in a way that can be used by others.
The eMERGE network has been funded here in the United States; it also has
several hospitals working together to create shared DNA biorepositories that
are linked to clinical histories from the electronic medical record, and there
are several initiatives underway using that data. More recently there has been
an initiative here called “All of Us,” led by former Vice President Joe Biden, and
this is trying to create a cohort of a million people of diverse backgrounds
here in the United States, not only to look at the clinical history, but to look at
the environmental history, lifestyle and biology and to combine these
three factors to be able to create a more rich resource for cancer research.
The US Veterans Affairs has a very large program as well, trying to
collect DNA from their members, and several clinical studies have been
created. One of the challenges, even though many of them use the VistA
medical record system, is that they’re migrating to Cerner, and even each
implementation of VistA is slightly different, and so they have been facing,
like many other people, the challenges of how to aggregate data. In China
there’s a very large – there’s more than one, but this is one of them – BGI, which is
developing a large Research Center and creating a large gene bank being used
for advanced clinical research. So how do you build these networks? These are very
large networks and require a large number of people. Here are
some of the roles that you need to have within these networks: chief scientists;
you need a research informatics director that understands how the information
needs to be organized in a way that can advance the research; you need
someone also on the clinical side that understands integration from the
clinical side, and both of these need to actually coexist.
You need a biomedical platforms architect that can bring these two sides –
the life sciences and the clinical side – together.
You need programmers and engineers, of course. Key to that is people with deep
understanding of taxonomies and classification. You need a very good team
for assuring the quality of the data and then, once you are able to assemble that
data, then you need data scientists, machine learning experts, to be able to
create the analytical tools. Very key is also good people that know how to create
good user interfaces. Some of these systems are very powerful in terms of
functionality, but very difficult to use. Typically these networks don’t
put enough energy or funding towards usability of these tools. It’s
important to have good ethics and privacy directors, that communicate both
internally but also to your stakeholders, and patients and the public, to know that
there’s trust and how the data is being used, archived and secured. To do this you
need to have continuous training and you need a very strong set of cybersecurity
experts, cloud computing, and to be able to
sustain it you’ll need an external partnerships director. So as you see
there are many different roles and this is just a partial list. To do this
you need to have a fairly long view and be able to hire the right skill sets. One
of the challenges in trying to bring this to life – well, part of it, sometimes,
is that some of these networks lack focus. They try to do too much too
quickly, and so they’re feature rich but function poor: lots of features but
none of the features actually is at a great enough depth. So you need to have
an initial focus that allows you to achieve a certain level of initial
success, and to be able to scale. As you see, there are a lot of complex skills
that are needed, so you need very highly trained personnel who are
multidisciplinary. You need good governance that respects all the
different disciplines and can create a cohesive
environment, and that requires strong leadership. Funding is a problem, because
it’s a lot, but it’s usually not the biggest problem.
My feeling is usually it’s the people problem: being able to get the right
combination of team that can work together. It’s important to be able to
collect consents upfront, especially if it’s going to be shared with people
outside of the network, and be able to get permissions to be able to reuse that
data. And there has to be strong data use agreements between institutions. One of
the things that has often been overlooked is diversity of patient sets,
which makes the findings limited to the type of diversity that you have in your
gene pool. There has to be an agreement of what are the taxonomies and
classifications to collect. This is one of the major challenges: we’re trying
to integrate data from different sources, so there has to be a roadmap to be able
to do that. And as I mentioned before, the deidentification of information is
important. It’s never completely fully deidentified, because there are ways to
sort of be able to identify it, especially with rare diseases, so there
has to be a very clear level to do that. And more importantly, or at least as important
as anything else, is the sustainable model. If you’re looking to be
able to partner with pharma and other groups, and be able to extend, that
has to be part of the initial design: everything from consents, to how
this data is stored, especially where you’re storing data from multiple countries; in
some cases, some datasets must remain in a country because of national law, so
then you need a distributed cloud architecture to be able to do that. So
there are many places where you can get training on this. There are graduate
programs in bioinformatics, clinical informatics and bioengineering.
Unfortunately, there aren’t enough of these programs to feed the growing
demand that there is. In this case, it’s, as you saw, a global need for
this, and it takes quite a bit of time to graduate people from these programs. And
it’s hard to keep faculty in these programs, because there’s a great demand
to move to the private sector. As well, there are some great societies such as the
American Medical Informatics Association and IEEE that have many journals
and conferences in this area, and there’s a growing number of specialty boards
that you now have in the United States. There’s now a clinical
informatics specialty board that is fairly new, and you can specialize in
that. There’s also one for nursing and then there are other ones for imaging
informatics as well as health information management. So I think it’s
important to see people who have those, or to send people to these programs, and
we’re very much of the belief in my division, and in many other places, that
education is really the cornerstone of the success of programs. Because you need
not only a good base of knowledge, but you need to keep up with this, and so
we’re very much involved in training both here at our university, as well as
on international projects. So what’s a roadmap to be able to achieve some of
these things? Well, first I think you need to have some keen leaders that
understand the global landscape and challenges involved, and the kind of
needs you’re going to have in terms of planning and assembling a strong
multidisciplinary team. You need to, I think, also involve patients as true
partners from the beginning, and not halfway through, because they’ll offer
you insight into things, research questions, that they’re actually
concerned about, but also advice on how they’d like to be respected, and how their data
is being collected. I think you need to start focused, with a plan that’s
feasible initially but is designed to scale both in data and the ability for
you to answer a wide range of research questions. And above all, I think you need
to have very clear transparency, both internally and externally, about what
you’re doing with this data, so that everyone is supportive of what you’re
doing. Here are some references to some of these things – I think the slides
will be available afterwards – both in bioinformatics, sources for training, and
some of the data and networks as well, and some definitions of the related
fields that are increasingly overlapping and all contributing towards developing
this. So as hard as this is, I’m an optimist and I think in the next few years we’re
going to see more and more of these networks, and as we have a richer data
set, I think this will not only benefit high-income countries but
hopefully will partner with developing countries as well, as true partners, and
help develop and generate a new generation of researchers, both here in
our countries, but around the world. Thank you very much.
Back to you, Diana.

Thank you, Yuri, for that gallop through bioinformatics 101. It was
extremely informative and very eye-opening for me, for sure. This is
the time that we open the floor to questions from today’s webinar attendees
and I would encourage you to take advantage of Yuri’s expertise during
this period of time. Your questions can relate to the topic of his talk:
informatics platforms for big data in health research, but Yuri also invites
you to ask questions about his broader areas of interest, including: mobile
health apps, wearable technology, sensor based applications for patient-reported
data, serious games for health and wellness, and the implementation of
online networks connecting patients, families and researchers. I’m going to
kick off. Yuri, this is obviously a very expensive undertaking. You mentioned that
there’s projects that have invested a billion dollars and then they didn’t
work out, but in terms of starting small, so that you can scale your project
to a national or a global level, what are the three most important things to get
right before you start to scale? If you were just starting a bioinformatics
platform, of data from a research project, for example, what are those core things
that you have to work the kinks out of before you can actually scale?

So I think
one of the key things is first agree on an initial research focus, and having
been part of an NCE, I know that it’s sometimes difficult
when you have a very large number of researchers who have diverse points of
view, and I think, given, you know, getting funding from any research agency is
highly competitive, and these are expensive initiatives, I think having an
initial research focus that allows you to implement something well, but with the
knowledge that you’re going to expand, to other types of data and other types of
research questions, I think is important. I’ve seen this done well and I’ve seen
it done poorly, as well, where, in some cases, the focus hasn’t expanded and
they’ve lost people within the network because they weren’t interested in that,
or people who started too broad and weren’t able to achieve any level of
success. So that’s a judgement call, initially, of what is the initial
focus. But having some transparency of what the roadmap is becomes important. I
think the second part of it is starting to have an agreement on data formats and
standards and coding, because the data will come in lots of different
classification methods and aggregating is always painful, but there has to be
sort of a roadmap of how to make that easier as you move forward. So that
might mean that certain centers might need to start classifying content
in additional ways; that doesn’t mean they have to stop how they’ve done
certain things, but to make it easier to do that. So having an agreement on
those taxonomies and classification, and budgeting time, and having the patience
to go through that, I think is important. If you don’t do that well, then it’s very
difficult, if not impossible, to actually merge your data. And then I think
there has to be a fairly transparent way of what are the benefits for these
networks going to be, and how those benefits are going to be shared.
And those benefits include, in part, publications. And so, I
think, in networks that I’ve seen done very well, there’s clear agreement at the
beginning of how projects will agree to authorship. In the MADCAP Network, one of
the things I like the most is how well and respectful the whole process has
been, and how they treat their African colleagues as true peers and
co-authors, and there’s a long term commitment to develop their skill sets
as investigators and give them leadership opportunities. So I think
having that transparency, whether it’s publication, economic, intellectual
property – I think sometimes institutions get very aggressive in trying to secure
intellectual property rights and income streams, and things like that, and if
everyone gets too rigid about that, you’ll never be able to get to a point
where you can actually collaborate and get the broader benefits. So having
transparency and having flexibility at the institutional level to create
these multi-institutional organizations. And I think both at a people level and
an institutional level, you quickly realize who are the true collaborators
and who are the people that maybe are really best working alone. It’s a
certain personality, both at the individual, organization and
institutional level, to be able to create these networks. But
truly, if you want to tackle some of the big questions, you’ll need access to larger data sets, and that requires collaboration
beyond your own institution.

Excellent, thank you very much.
Marshall, do we have any questions?

We do, indeed. We have four questions. All of
them have been submitted in writing, so I’ll read them out, starting
with the first, which comes from Dean Befus who wrote: “You mentioned
industrial partnerships. Can you speak to the challenges involved, and to some of
the successes that have occurred in this area?”

For many universities, it’s a challenge, in that, you know, the basic
premise at a university centers around academic freedom, and the
corporations have a mandate around maximizing shareholder value, which is
sort of a competitive edge over that, and so there tends to be
sometimes this difficult divide of: how do you balance openness versus
the need to protect income streams? Increasingly I think companies are
realizing that even though there are multibillion-dollar pharmaceuticals, that
they don’t have all the data they need, nor the expertise, to be able to analyze
it. So I think there’s been a change over the last ten years in a growing
number of companies that are willing to enter in collaborative agreements
with universities and to be able to share income streams and publication and
other – so I think it’s important for universities to maintain their
integrity and transparency and lack of bias. I think it works better when there
are multiple companies involved in a consortium with multiple universities, in
that there’s more of an even playing field for all the stakeholders involved,
but I think, above all I think if there’s a high degree of transparency, high
degree of research ethics, it is achievable.
I think some Europeans have been able to achieve this better than in the US, but
in the US there’s more activity. I think both sides have to
be able to get accustomed to each other, because
there is a different personality between both organizations,
but I think there’s a growing movement to create those, and I think so long as
there’s a high degree of transparency, I think it’ll be done very well. Thank
you for that question, Dean, and thank you for the answer.
Next question. The next question comes from Amrit Singh, who wrote: “Which
programming languages are most commonly used in bioinformatics: Python, Java,
C++? And how useful is it to have a web development background (full-stack, Node.js, Django, etc.)?” So the answer is: yes. They’re all used.
There is a very rapid development of new tools; you know, ten years ago R really
wasn’t on the scene; now it is, and so is Python, and both of those are now major
forces in the area. Ten years from now, there’ll be new tools as
well, so I think today definitely all the ones you mentioned are important. I think
it depends on where your passion is and where you want to spend your
time, because different tools do different things better. For example, some
are better at the analysis; so certainly, if you’re into the analytical outcomes,
R and other tools are particularly important. If you’re more
interested in the infrastructure part of it, then full-stack and those become more
important. The development of these systems is so large that really
there’s a great space. One area that is pretty weak is good user
interfaces. If you’re looking for a niche area, that will be increasingly
valuable. I’ve seen lots of products, both commercial and institutional, that never reach their potential because people
couldn’t figure out the interfaces. One other thing to understand
is workflow, which I didn’t mention: how you go about curating data
from one set to another requires a set of tools to be able to do that. So really
you have to look at where your passion is, in that; try a few
different things and then see where you want to spend your time and then be
willing to continually keep up. There’s a new generation of databases,
NoSQL, for example; it’s a different type of architecture for large
data sets, so that’s another area that is also very high in demand: high-performance
database architectures as well, and in-memory databases. Great. Amrit,
thank you for that question; Yuri, thank you for the answer.
Marshall, next question? The next comment and question comes from Eduardo Reyes
Serratos who wrote: “Wonderful talk. At the beginning of your talk, you spoke briefly
about the flow of different populations and how they are distributed
in the world. Given that the world is becoming more
multicultural, I want to know if you have any thoughts on collecting clinical data
from populations where 1) access to medical care is difficult and 2) access
by medics to databases is a challenge, given that they may not have the
necessary training to handle them yet.” That’s a great question that’s very dear
to my heart. I think, you know, outside of high academic medical centers within the
G8 countries there’s a challenge to get good detailed data. The academic
medical and research centers are
both hospital and research center, and they have large numbers of
staff to collect more detailed data, but the population is not all
living near an academic medical center, and so you get vast numbers of people
that are underrepresented in most studies. Even in the United States and
Canada, for example, Native Americans, Canadian First Nations
people and Hispanics have been traditionally underrepresented, and
then as you start to go globally into Africa, into Asia, it is even
more difficult. Now the positive thing is that the technologies for
collecting data, analyzing, and even doing screening are becoming increasingly less
expensive, so I think what needs to happen is that we need to start
developing training in these other countries so that people are trained in
bioinformatics and best practices in data collection and analysis. That’s
one of the things that MADCAP is doing; the data is
being collected in countries and there’s a strong educational component. I
think when we do that then we start treating our international partners as
true peers and partners, rather than parachuting in when we have a grant to
collect sample data, and then leave and leave all the equipment behind. Many
countries felt quite “burned” by these previous studies, where some
genetic data for a particular interest has been collected, but there has been no
long-term planning. Increasingly there are more research groups that are trying
to develop a more sustained vision for how to do that research, and I would
encourage governments, and people in their grants, to write in the need to
be able to allocate more sustainable funding for international
projects, because by helping others you’re helping yourself, but you’re also
helping them, and I think it’s the humanitarian way to do it. It’s clear
that with the increasing migration of populations,
through refugees or through natural immigration, we are going to have a more diverse genetic pool in most cities around the world.
So I think it becomes really important, and I think tied to that is
responsible conduct to be able to educate patients, especially people who
are illiterate, or low literate, to be able to understand what
they’re contributing their data to, and so I think including social scientists
and anthropologists to be able to do this in an ethically proper way is
important. I think this is a key area that more people should be focused
on. I’m an optimist; I think there will be more people involved in this type of
work. Thank you, Yuri, that’s a great example of why the team itself needs to
be multidisciplinary so these different perspectives can be taken into account.
Eduardo, thanks for asking that question. Marshall, are there any additional
questions? Yes, there are three more. The next one comes from Yanna Ma, who
wrote: “I’m an immunologist. Is it possible for one to get a self-directed form of
bioinformatics training? Can you recommend any such courses? How can one get started in the field?” So there’s an increasing number of free courses.
Some of these networks that I mentioned on their portals have online education.
There are books in the field as well and I cited some of them as well. I think
it’s also eye-opening to go to a conference and network with other people
and visit other labs. I think one of the best ways to learn this is actually
to join an existing lab, and so if you have an opportunity to spend four months or a
year or get a degree it’d be useful, but if you don’t have time for a full
degree, you could visit another lab. Our lab, for example,
accepts visitors, and I know other ones do too, with many working four months or more to
do research, and it’s a really good hands-on way to learn. But there’s an
increasing number of courses available online. And even if you get a
degree, you should pursue these online education and continuing education
opportunities because the methods and tools are changing literally every day.
Thank you, Yuri. And so for all the AllerGen researchers and trainees that
are online: if you’re interested in a lab exchange with Centers of Excellence
in bioinformatics, talk to your supervisor and then speak to
Leah Graystone at the AllerGen administrative center, because we do have
a program that will provide support for those kinds of capacity-building
opportunities. So thank you very much, Yuri, for the answer to that question.
Marshall, next question? The remaining two questions are both from Dean Befus. I’ll
read them one at a time. The first of those two is: “Can you review some of the
national and/or multinational networks for birth cohorts with broad integration,
including clinical data?” Birth data is difficult to get.
Well, all data is difficult to get just because of its fragmented nature. There are
research networks that have been doing this. Here, for example, at Boston
Children’s, there’s several investigators involved in certain centers, most of
those… I’m not aware of something as large as the kinds of networks that I’ve
shown here, like at the million person level. Typically, children’s research is funded at
a lower rate than some of the adult diseases, but there are
individual NIH-based studies and some of those repositories are
being made. If you look at the database link that I have for the efforts that
the NIH and National Library of Medicine are doing, they are trying to make it
easier to discover those databases, and they’re accessible through … so go to that slide and click on there.
Patricia Brennan, who is the head of the National Library of Medicine and
who’s also a nurse and an informatician, is doing a stellar job at logging from
resource agencies the fund and increase in that environment. So I’d start
looking there as well. Okay, thank you. Next question? Before I read Dean’s last
question, I see one of our attendees has raised his or her virtual hand so I’m
going to open the mic for that person to speak and I apologize I may mispronounce
your name: Sze Man – I think, hopefully, that is an
approximate pronunciation – I’m going to unmute you now so you can pose your
question. Please, go ahead. Hi, that wasn’t bad for my name, thank
you. My question, actually, if I may, relates to your broader interest in
health apps and health devices. I see them as a valuable research tool
because they can provide so much data and, you know, health devices nowadays can
measure blood pressure, heart rate, even lung function sometimes, but I was
wondering how many of these health devices are actually
clinically validated before we can use them for research, and if they’re not
validated, do you think that they need to be more rigorously validated before
we can actually use them in research? That’s a great question. There
is in the United States a group within the FDA that is validating those devices,
if they are being used for diagnostic or therapeutic use, and so far there’s a
handful of them that have submitted their data and I think there were a few
approvals. It’s a fairly new initiative. There is quite a bit of
variation between consumer-level and clinically rigorous devices, and I know that
some groups have actually just taken to their own efforts locally to
test different devices, but increasingly these devices are becoming more accurate
and they will become more affordable, but there’s a handful of them. If you
email me, I’ll send you a link to a few of them that I know. I think Europe
is trying to develop similar standardization. One of the challenges is
that to get it clinically approved takes a lot of time and effort, and to collect
data, and a lot of device companies, especially the startups, don’t have the
money to be able to submit it. There may be some things that haven’t been approved
that are actually quite effective, so it’s still a “buyer beware” that you
should do your own due diligence to validate them as well. All right, thank
you, and Marshall, your final question
submitted online? Yes, this is again from Dean Befus and it’s related to the last question,
to some extent, and he wrote: “Some publicly funded healthcare systems have been
hesitant to pick up privately developed patient apps. Do you have any advice on
standardization of such apps and their acceptance and integration into publicly
funded healthcare systems?” I always like to work at the cutting edge of
technology, and the things that are disruptive, and it’s always been a
challenge to get a lot of those innovations accepted into the main fold,
but at the same time if people didn’t work in that space we wouldn’t actually
make progress. It’s a reality that if you always enjoy the
cutting edge, you’re always going to find resistance. So I would suggest anyone
who’s in that space just to have long term resiliency. The way you get
more acceptance is 1) to be able to share research studies where they have
been able to validate it. I know that one of the major questions that a lot of
hospitals and other systems have is sort of the sustainability… So apart from
just the accuracy of devices, is how to integrate them into clinical workflow
and the cost and security to connect to do real-time data sharing between that…
Now, there is a technology in the United States called SMART on FHIR which
is making it easier to connect apps to medical records, so you write
once to that FHIR standard and it connects to several different medical
records, so it is becoming easier to connect and I think you have to sort of
make the really strong argument of what is the business value, not just the
healthcare value, that people improve health care outcomes, but there are
always bean counters at any organization, and better care with
better outcomes usually leads to lower costs, because there are fewer
readmissions and better quality of life for patients. And so being able to
quantify what is the return on investment, not in the healthcare sense
but on the financial side of it, and so partnering with people who are from the
business school, who are with public health schools, to do the analysis. Lastly, I think there’s just a sort of ideological resistance to change, and
that you need to be able to find change champions within organizations
that are willing to keep going through the process of getting approvals.
We always in our division make our graduate students follow through on projects, and of
course they usually have to go through a committee to get approved; sometimes, if
there are lots of concerns, you get relegated to a subcommittee, and so we
always tell them this is where your real training begins, is after you learn to
get through the committees. But persistence is important, having
facts, and tackling not just the healthcare outcome but also the financial and
long-term benefits of doing that, and I think over time people will
migrate – once they see other organizations successfully integrating
these smart sensors, mobile apps and other things, and seeing better
quality results, you’ll see the success. I know that there are, for example, serious
games for health for asthma to help people with medication management; tools
are being used for symptom collection at home; and so increasingly, I think, people
will realize that it makes not only healthcare sense in terms of outcomes,
but it also makes financial sense. The final resistors will
give up when there’s more data on all sides to convince them that this is the
right way to go. Well, thank you for that question, Dean,
and Yuri, we’re approaching three o’clock here at McMaster University and our hour
is nearly up. I feel like this afternoon we just
skimmed the tip of the iceberg of your experience and expertise in this area,
but your passion for informatics and its transformational potential to improve
health outcomes, through better understanding and access to health
information, has really shone through. I think we’ve all learned a great
deal, so thank you so much for joining us.
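
As a rough illustration of the "write once to the FHIR standard" idea from the last answer, here is a minimal Python sketch of how an app might ask any FHIR-conformant record system for a patient's heart-rate readings and read the result. It is a sketch under stated assumptions, not something shown in the talk: the server address, the patient ID and the sample response are all made up, and the OAuth2 authorization step that SMART on FHIR adds on top of FHIR is left out entirely. The search pattern (`Observation?patient=...&code=...`) and the Bundle layout follow the published FHIR specification, and 8867-4 is the LOINC code for heart rate.

```python
import json
from urllib.parse import urlencode

# Hypothetical FHIR server base URL (placeholder, not a real endpoint).
FHIR_BASE = "https://fhir.example.org/r4"


def observation_search_url(base, patient_id, loinc_code):
    """Build a FHIR search URL for one patient's Observations with a LOINC code."""
    query = urlencode({
        "patient": patient_id,
        "code": f"http://loinc.org|{loinc_code}",  # token search: system|code
    })
    return f"{base}/Observation?{query}"


def extract_values(bundle):
    """Return (value, unit) pairs from the Observations in a FHIR Bundle."""
    results = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "Observation":
            continue
        quantity = resource.get("valueQuantity")
        if quantity is not None:
            results.append((quantity.get("value"), quantity.get("unit")))
    return results


# Made-up sample of what a server could return for the search above.
SAMPLE_BUNDLE = json.loads("""
{
  "resourceType": "Bundle",
  "type": "searchset",
  "entry": [
    {"resource": {"resourceType": "Observation",
                  "valueQuantity": {"value": 72, "unit": "beats/minute"}}},
    {"resource": {"resourceType": "Observation",
                  "valueQuantity": {"value": 75, "unit": "beats/minute"}}}
  ]
}
""")

url = observation_search_url(FHIR_BASE, "example-patient", "8867-4")
readings = extract_values(SAMPLE_BUNDLE)
print(url)
print(readings)
```

The point of the sketch is that nothing in it is specific to one vendor's medical record: pointing `FHIR_BASE` at a different conformant server would, in principle, leave the rest of the code unchanged.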
