Category Archives: IT Architecture

Uncommon Thinking…

From the Flickr stream of Bre_Pettis

I was chatting with a colleague about the new EDUCAUSE slogan, “Uncommon Thinking for the Common Good”, when I realized that the saying encapsulates one way to think of my work as an I.T. Architect.  “Uncommon Thinking for the Common Good” is what I try to foster in the teams that I work with.  I’ll explain this in two parts: “Uncommon Thinking” and “for the Common Good”.

Uncommon Thinking:

I try to break people out of their daily routine and their comfort zone.  For instance, I have sat in meetings where a team is supposed to develop a new user interface (UI) for a new application.  I’ve watched as teams redraw the UI for the old application, the one they use day in and day out, as the solution for the new system.  I’ve also seen teams “re-think” how a business process could be done, only to end up with an automated version of the current process.  The new implementation of the old solution substituted emails for people running around with paper.  They follow the same steps, replicate the same authorizations and send the same forms, often without asking “why this form” or “why this person” or even “is this necessary at all”.  My job is to get them to question their old ways of doing things.

People like what they know.  They understand what they use daily.  But advancement comes when we change and disrupt routines, not when we replicate them into a new technology.  You have a telephone book at home with White Pages for people and Yellow Pages for businesses.  Changing that into two Word files you can print doesn’t bring great advancement.  It might be easier to carry only the pages you need but that doesn’t really improve the process.  Search capabilities are a big improvement.  Rethinking how you use the information, such as mapping businesses onto maps so you can find restaurants near your hotel, that brings advancement.  The routine of grabbing a book and looking something up is thrown out.  The new routine is to grab a laptop, look for wireless and Search.

I often introduce myself to new teams saying that my job will make them uncomfortable because I will ask them to throw out what they know and what they are comfortable with.  I tell them I will challenge their assumptions.  I say this not because their assumptions are wrong but to make sure their assumptions are correct and we accept them for the right reasons.

I love the fact that the Web 2.0 explosion is going on.  There are so many examples of “other ways to do things”.  I bring these examples and ask, “why can’t we do this instead?”  I show them Netvibes and ask, “can we make our pages this flexible?”  I show them Etsy’s Find By Color page and ask, “can we make creative ways to search like this?”  I show them The Northface catalog and ask, “should we have filters to help people search like these?”

 

Etsy Color Browser

It’s not that I think we should have a UI that looks like any of these sites but I want to break the team’s mindset and get them to start thinking about all of the rich possibilities.  I want them to work with a blank canvas and a rich palette of colors.  I want them to really get imaginative in their solutions to the problems.

I had a watercolor instructor that I worked with at UC Santa Cruz.  We were painting in the woods one day.  Everything I produced came out flat, boring and uninteresting.  The paintings were awful, actually.  I was having a terrible time.  He came by, had a look and asked how it was going.  I grunted out my disgust.  He said, “Give me three paintings, but you can’t use any browns or greens at all. No earth-tones.”  I’m sitting in a forest of browns and greens.  I was forced to paint purple and blue trees and red ferns.  At first it was very uncomfortable and I was very hesitant.  The first attempts were also awful.  But then, it became fun and playful and the paintings improved.  I was forced to let go of “how it is” and instead I had to play with “how it could be”.

That is the uncommon thinking of the Architecture practice.  Letting go of the how it is and thinking about how it could be when we start with a blank canvas and rich palette.

For the Common Good:

The other aspect that I deal with on teams is the narrow focus of their solution.  Often, the solutions that are put forth solve the very local needs of the group of people sitting around the table.  My work is to ask, “how does this fit with the broader issues that the people deal with daily?”  “What does this solution do to actually help people?”  “What impact will this have on them?”  Not all solutions should be broadened and generalized to solve a larger issue but we should consider their larger impact.

Every application must fit into an already rich application environment.  No application is truly a silo-application anymore.  Someone has to use it.  That someone already has a username and password if not several.  That someone already has a day that is full of tasks and applications.  That someone has things that don’t work so well, things that they are comfortable with and things that they cherish dearly.

The impact assessment of a new solution should consider all of the people that the solution will affect.  If the new process changes their lives from reading paper documents to reading email, the users might not consider it an improvement.  What if reading the paper documents is what they do on the train in the morning?  Then your solution is a step backwards for them.  What seemed like a good idea to the team, reducing paper and using electronic delivery, actually had a negative impact on the user and on overall productivity.  The user did that work before they got to the office as part of their daily routine.

This is part one of the “For the Common Good” part of my job. The solution that is delivered needs to take into consideration all those that will be impacted and it needs to fit into their lives and, ideally, change their lives for the better.

The second part comes into play during information gathering and sharing about the solution.  The new application or solution needs to be described in terms of the business value and the overall positive value of the change.  If you are going to add work to busy departmental staff, then it better be for something more than “your system”.  It better be for something like improving the enrollment process for students.  It better be for some larger good than simply benefitting the group developing the solution. You need to gather the business process improvements that the new solution will provide and then use those improvements to describe why the solution is important.

The final part has to do with scope.  Often, issues in one group are problems in another group too. Finding co-sponsors is a way of expanding the positive gain for the new processes or solution.   I spend time looking for others who I can bring into the discussion.  I look to see if the problem can be solved once for several constituents.  The broader solution will require collaboration and compromise but it can bring greater value and reduce the chaos of one-off solutions.  If the problem is solved once for many groups, then there is only one solution to maintain and there are many people who can provide input and expertise.

For me, “for the common good” means considering the broad impact, looking for the greatest value and delivering a solution for the largest constituency.  

Uncommon Thinking for the Common Good:

Bringing this all together provides one view on what I do as an I.T. Architect.  I get people to think broadly about a solution.  I get them to use a blank canvas and a rich palette of ideas when thinking of how we should solve a problem.  I also get them to think about how that solution fits in the larger environment, who it will help and who it will impact and finally who else should be brought into the discussion so we can deliver a far-reaching solution.

If I do my job well, then we get truly creative and expansive solutions that fit into the organization, improve people’s lives and help the greatest number of users.

 

Advanced CAMP – Part 3

Merri Beth Lavagnino – Privacy and Policy

Policy and privacy are really about considering the human aspects and impacts of technology.  Policies come in several forms: strategic direction and operating philosophy (usually informal and cultural), and public and institutional policies (both documented, and usually legal documents).

Institutional policy – a statement that reflects the philosophies and values of the project, service, organization or federation.  Policies should be clear and concise, applicable across a wide range of activities and should not change very much.

Why create a policy?

  • When reasonable people disagree
  • To guide thinking when making decisions
  • To correct repeated misbehavior
  • When there are significant risks or liabilities
  • In response to external forces like regulation or law

Where does the policy apply?  Federation, Institution, Service

Real-life stories:

  • Email Outsourcing:  vendors proposed that we would do incident response and legal requests for both students and alumni.  There was no policy that said they had to be in charge and in control.  She took the discussion back to the original goals for the project: (1) improve and add services for students and (2) reduce their costs.  So they did not take on the incident response because that would not reduce the costs.  That was the policy that helped inform the decision.
  • Course Management System:  they changed their course management model.  They began to get incident reports because the new service didn’t match the old policies for the previous system.
  • Virtualization:  They moved to a new virtualized system.  The old policies were built around knowing that super-hot data is on a specific machine, with a specific system admin.  Now, they didn’t know what machine had the data and all sys admins might have access.  They had to expand training and the understanding of how they would manage super-hot data.
  • InCommon Agreement:  Thought that went very well.

“A policy is a temporary creed liable to be changed, but while it holds good it has got to be pursued with apostolic zeal.”  Mohandas K. Gandhi

Privacy:

Categories of privacy harms:

  • Intrusions : They come into your space and contact you and tell you what to do (spam, cold calls)
  • Information Collection:  They watch what you are doing more than they should (tracking, interrogation, etc)
  • Information Processing:  They have a lot of data about you, and they do things with it. (data mining)  Need to watch out for secondary use – collect for one reason then use it for another reason.
  • Information Dissemination:  They disclose data about you, perhaps more than you think they should.  (Transferring data, true or false facts)

Fair Information Practice Principles:  The FTC drafted these principles and they do enforce them.  Higher Ed is not under the FTC’s jurisdiction but users are expecting these principles to be met.  If we don’t

  • Notice/Awareness:  User should be given notice of your information practices, in order to make an informed choice about whether to provide information.
  • Choice/Consent:  User should be given options as to how any personal information collected from them may be used.
  • Access/Participation:  Users should be given access to the data held about them, and the ability to contest that data’s accuracy and completeness.
  • Integrity/Security:  data should be secure and accurate
  • Enforcement/Redress:  there should be a mechanism in place to enforce fair information practices and it should include appropriate means of recourse by injured parties.  At a minimum, you should right the wrong.

Ken Klingenstein: Federated Identity and Data Protection Law

Good quote from Ken K:  “This is an attempt to bring trust to internet via technology not just because it is just us chickens”.
EU Law Directive 95/46/EC:  You can process personal data when it is required to perform a contract, when it is required to satisfy a legal duty, or when you have consent.

Identity Providers must identify which services are necessary for education and research.  Must inform the users.  May seek users’ informed, freely given consent to release personal data to other services.  You have to show why it is important.  Should have a data processor/data controller agreement with all service providers to whom personally identifiable data is released.  Must ensure adequate protection of any data released to services outside the EU.  We have to play by the EU rules.

Service Providers must consider whether personally identifiable information is necessary for their service or whether anonymous identifiers are sufficient.  You may request personal information from users but you must inform.

There is no normalized definition of what Personally Identifiable Information (PII) is.  There are questions about email addresses:  a third-party email address might not be PII but a .edu address might be.  So the content might be more important than the field.

IP Addresses – if it is a dynamic address it is not PII.  So, unless you know it is a dynamic address, then you have to treat it as PII.

EduPerson Targeted ID – this is going to the EU privacy commission this Fall.  It is a 32 bit opaque identifier that is different per site visited.
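
One common way to get an opaque identifier that differs per site is a keyed hash over the user's internal ID and the service's ID.  The sketch below only illustrates that idea.  The function name `targeted_id`, the secret and the example service URLs are assumptions, and it does not reproduce the actual EduPerson Targeted ID format or length.

```python
import hashlib
import hmac


def targeted_id(secret: bytes, user_id: str, service_id: str) -> str:
    """Return an opaque identifier that is stable per (user, service) pair.

    Hypothetical helper: the keyed hash hides the real user ID, and
    mixing in the service ID makes the value differ per site visited.
    """
    message = f"{user_id}!{service_id}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


if __name__ == "__main__":
    secret = b"identity-provider-secret"  # assumed; kept private by the IdP
    library = targeted_id(secret, "jdoe", "https://library.example.edu")
    wiki = targeted_id(secret, "jdoe", "https://wiki.example.org")
    # The same user presents a different identifier to each service.
    print(library != wiki)
```

Because the value is derived rather than stored, the identity provider can recompute it on every login, while two services cannot correlate the same user by comparing identifiers.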

OASIS Cross-Enterprise Security and Privacy Authorization (XSPA) – a just-formed group.  A mechanism to allow consent agreements to flow with data.  The first and dominant Use Case is health care.  Looking for other Use Cases.  Does this make consent a new service in our loosely coupled services?  Do services need to be consent aware?

Report Out from Discussion Sessions:

Data Modeling Group:

Modeling person and organization data.  Modeling of organization data is remarkably difficult, not just in the nature of the data but also in the resistance that you get from organizations to being characterized.  There are multiple organization charts: financial, HR and reporting structure.  The characterizations can be political.  Are there pressures that will lead to the marginalization of the old way of doing things?  Organizations that don’t want to be characterized may not get services.

Service Discovery:

What would a service description look like:  what is it called, cost, how to call it, operational context (where is it physically located).  Discussion about how you describe the service and how you recognize similar services in distributed locations.  Talked about how the grid is doing this with their RNS.
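
To make the fields in that discussion concrete, here is a minimal sketch of a service description record and a naive lookup for similar services.  The class name, field names, `find_similar` helper and example URLs are all assumptions for illustration; no such format was agreed on in the session.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDescription:
    """Hypothetical record holding the fields raised in the discussion."""
    name: str      # what is it called
    cost: str      # cost to the consumer
    endpoint: str  # how to call it
    location: str  # operational context: where it physically runs


def find_similar(services, name_fragment):
    """Return services whose name contains the fragment, ignoring case."""
    fragment = name_fragment.lower()
    return [s for s in services if fragment in s.name.lower()]


if __name__ == "__main__":
    catalog = [
        ServiceDescription("Course Roster", "free",
                           "https://a.example.edu/roster", "Campus A"),
        ServiceDescription("course roster lookup", "free",
                           "https://b.example.edu/roster", "Campus B"),
        ServiceDescription("Emergency Notification", "metered",
                           "https://a.example.edu/notify", "Campus A"),
    ]
    for service in find_similar(catalog, "roster"):
        print(service.name, service.location)
```

Matching on names is the weakest possible test of similarity, which is part of why the group kept returning to shared metadata and naming conventions.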

What is happening today: people using Google to search for services and looking for a WSDL.

How do you get consent?  What about promises and claims?  What about a directory of all the services?  What about a directory of directories?  You could have a convention for naming the directory so you could at least find the directories.

DNS works for finding things.

Governance:

Domain Governance – governance revolves around an application or a data element, or attribute (student ID).  These models will have to evolve to domain governance: enrollment, IdM etc.

Who owns the data especially as the data is transformed and sent along the ESB?  Services are requesting the data that can then be used by other services.

SLAs – keeping track of who can use the service.

The need for a directory of services especially in emergency notification.  There is also a need to know who is consuming services so you can notify on changes.

What is being done now on campuses?  It is evolving.  Identity and Access Management is being governed as a domain at Penn State.

Saint Louis University has good examples of domains in higher education that need to be governed as a domain.

Lightning Talks:

Rob Carter:  Tracking and Authenticating IP in Cyberspace

We had all of our resources stored inside the walls of the institution.  With cloud computing and Web 2.0 applications, we now see our intellectual property out in the cloud.  How do we track the reuse of it?  How do we contextualize the content?

How do we know that it is really an artifact of mine and not someone spoofing my creations?

Could solve this with digital signatures.  What if we could add metadata before it goes out into the cloud?  Get a signature of the object and attach the signature to the object or store it elsewhere.
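
A rough sketch of the "sign it before it leaves" idea follows.  Python's standard library has no public-key signing, so this uses a keyed hash (HMAC-SHA256) as a stand-in for a real digital signature; a production version would use an asymmetric scheme so that anyone could verify without holding the secret.  The function names, key and metadata fields are assumptions.

```python
import hashlib
import hmac


def sign_object(key: bytes, content: bytes, metadata: dict) -> str:
    """Return a detached signature covering the content and its metadata."""
    digest = hmac.new(key, digestmod=hashlib.sha256)
    # Sort the metadata so the same fields always hash the same way.
    for field in sorted(metadata):
        digest.update(f"{field}={metadata[field]}\n".encode("utf-8"))
    digest.update(content)
    return digest.hexdigest()


def verify_object(key: bytes, content: bytes, metadata: dict,
                  signature: str) -> bool:
    """Check a stored signature against the object as found in the cloud."""
    expected = sign_object(key, content, metadata)
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    key = b"author-signing-key"  # assumed secret held by the author
    meta = {"author": "example author", "license": "CC BY"}
    original = b"lecture notes, version 1"
    signature = sign_object(key, original, meta)  # store alongside or elsewhere
    print(verify_object(key, original, meta, signature))
    print(verify_object(key, b"spoofed copy", meta, signature))
```

Covering the metadata as well as the content means a copy with an altered author or license field fails verification just like altered content does.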

How does this align with Creative Commons licensing efforts?  You can search and crawl for CC licensed objects that you use.

Loretta Auvil:  Music Analysis.

Dynamic analysis of a Tom Lehrer file.    Very entertaining.

Scotty Logan:  IAM Services and Well Behaved Apps

If every app does its own thing, there is no real management.

Trust the container:  Identity – you can get a user name from Tomcat et al, Authentication, Authorization

Have the container provide the groups and privileges as a URI
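
The talk used Tomcat as the example container.  A Python analogue is a WSGI application that reads `REMOTE_USER` from the environment instead of doing its own login.  In this sketch `REMOTE_USER` is the standard CGI/WSGI key, while `CONTAINER_PRIVILEGES` and the privilege URI are made-up names standing in for however a container might pass privileges along.

```python
REPORT_PRIVILEGE = "urn:example:privilege:view-reports"  # hypothetical URI


def app(environ, start_response):
    """WSGI app that trusts the container for identity and privileges."""
    # Identity and authentication: already handled by the container.
    user = environ.get("REMOTE_USER")
    if not user:
        start_response("401 Unauthorized", [("Content-Type", "text/plain")])
        return [b"authentication required\n"]

    # Authorization: privileges arrive as space-separated URIs in an
    # assumed variable; the app only checks for the one it needs.
    privileges = environ.get("CONTAINER_PRIVILEGES", "").split()
    if REPORT_PRIVILEGE not in privileges:
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"not authorized\n"]

    start_response("200 OK", [("Content-Type", "text/plain")])
    return [f"hello {user}\n".encode("utf-8")]
```

The application contains no password handling and no group lookups, which is the point: those decisions are managed in one place rather than in every app.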

OAuth.net – a specification developed by a group to solve the “I want my Flickr protected photos on Facebook but I don’t want to give you my Flickr username and password”.

Advanced CAMP – Part 2

Dave Gimpl:  Computing as a Service

Infrastructure for vaporware.  They are working on the infrastructure that enables cloud-computing.

Challenges in the data center:  rising costs of operations, the explosion of data, the difficulty of deploying new applications and services, the difficulty in managing complex virtual machine systems.  When you map the business processes, they map to a variety of systems on the data center floor.

Blue Cloud is IBM’s entry in Cloud Computing.  Cloud Computing is holistic systems management.  Similar to Grid or Cluster computing.  A combination of “pervasive virtualization” for both server and storage.  Allows for virtualization across varied hardware (I think).  On demand and autonomic management and Utility Computing (Amazon’s service offering).

They gather up like systems (not necessarily identical) and manage them as a pool.  The focus changes from managing the SAN or server.  You let the “ensemble” manage itself and you manage the Virtual Image.

When the image moves to another system, does it move with state?

North Carolina State’s implementation is open source.  All of the standards are open source.  The ensembles are wrapped with SOAP/SOA interfaces.  At the North Carolina State Virtual Compute Lab, a student can request an XP machine to do their project.  They get the machine in increments of 30 minutes.  They are providing service for other institutions in their area.

Ken Klingenstein mentions a paper “The Computational Data Center: The Science Cloud”

Mark Morgan:  Genesis II – Accessible, Standards Based Grid Computing

http://www.cs.virginia.edu/~vcgr

The problem:  we have target grid users who are unable or unwilling to learn new programming tools & paradigms.  Users want the benefit of the grid without having to know about the grid.

Anything you can put a service in front of and put on the internet, is part of the grid.  Telescopes, microscopes, computing power, storage, data, sensors.

Want to share this but sharing in a mutually distrustful domain.

Genesis II implements the standards that come out of the OGF (Open Grid Forum) to test them and vet them.  Open Grid Services Architecture is part of the OGF.

Grids have been around for a long time but they are not being used.  People who design grids want cool features.  Users don’t care.  Genesis II is focused on the user and making grids usable.

The Specs:

  • Resource Naming Service (RNS) –  maps human-readable name to web service endpoints.  Supports Add, Remove, List.
  • ByteIO – allows you to treat grid resources like a POSIX-like file resource.
  • Basic Execution Service  (BES) – interface for starting, managing and stopping computing jobs.
  • WS-Naming – Endpoint Identifiers, Endpoint Resolution
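
The Resource Naming Service item above boils down to a name-to-endpoint map with three operations.  The toy class below illustrates just that shape in memory; it is not the OGF specification or the Genesis II API, and the class name, method names and endpoint URLs are assumptions.

```python
class NamingService:
    """Toy in-memory name-to-endpoint map with Add, Remove and List."""

    def __init__(self):
        self._entries = {}

    def add(self, name: str, endpoint: str) -> None:
        """Bind a human-readable name to a web service endpoint."""
        if name in self._entries:
            raise KeyError(f"name already bound: {name}")
        self._entries[name] = endpoint

    def remove(self, name: str) -> None:
        """Unbind a name; raises KeyError if it was never bound."""
        del self._entries[name]

    def list_entries(self, prefix: str = "") -> dict:
        """Return bindings whose name starts with the prefix, sorted by name."""
        return {
            name: endpoint
            for name, endpoint in sorted(self._entries.items())
            if name.startswith(prefix)
        }


if __name__ == "__main__":
    rns = NamingService()
    rns.add("/home/jdoe/results.dat", "https://grid.example.edu/byteio/41")
    rns.add("/queues/cluster-a", "https://grid.example.edu/bes/7")
    print(rns.list_entries("/home/"))
```

Path-like names are what let the rest of the system present grid resources in the "file-like" way described next.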

You interact with the grid system in “file-like” ways.  Double click on a database query, drag a job onto a server resource, etc.

They use an FTP interface to manage resources on the grid.  On the Linux side, OGRSH acts as an intermediary between bash and the grid.  Users can do “ls”, “cat”, “cp” and OGRSH will redirect requests into the grid as appropriate.

Nigel Watling: Cloud Computing and the Internet Service Bus

http://biztalk.net

Building out a new data center in Chicago.  Microsoft is deploying 10,000 servers a month to support cloud computing.  Amazon expects their services operation to bypass the retail business soon.

Issues that come up:

  • How do I expose a service broadly?
  • How do I handle identity and access control?
  • How do I interoperate?  Between vendors?  Between standards?

Connect their composite application through an ESB to the internal applications and then out to the cloud for distributed resources.

Roland Hedberg:  OM2

http://www.openmetadir.org

OM2 is about representing events and moving information about events from one place to another.  A publish-subscribe messaging system originally designed around IdM.  Implementations in Python, Java and Perl.

Three ontologies:  message, operation and object ontologies.  Message is the header like for mail.  Operation describes the actions (Miro ontology) which includes if-then-else as well as the usual add, modify, etc.  Objects describe the objects.
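
For readers unfamiliar with publish-subscribe, here is a minimal sketch of the pattern with a message split into header, operation and object parts, loosely mirroring the three ontologies.  It shows the general pattern only, not OM2's API or its RDF/XML messages; the `EventBus` class and the dictionary layout are assumptions.

```python
from collections import defaultdict


class EventBus:
    """Tiny in-process publish-subscribe hub keyed by object type."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, object_type, handler):
        """Register a callable to receive messages about one object type."""
        self._subscribers[object_type].append(handler)

    def publish(self, message):
        """Deliver a message to every subscriber; return how many got it."""
        handlers = self._subscribers.get(message["object"]["type"], [])
        for handler in handlers:
            handler(message)
        return len(handlers)


if __name__ == "__main__":
    bus = EventBus()
    bus.subscribe("person", lambda m: print("directory got", m["operation"]))
    bus.publish({
        "header": {"sender": "hr-system"},          # like a mail header
        "operation": "add",                         # add, modify, ...
        "object": {"type": "person", "id": "jdoe"}, # what the event is about
    })
```

The publisher never names its receivers, which is what lets new consumers of identity events be added without touching the source system.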

Messages are based on RDF/XML.  Includes support for Dynamic Delegation Discovery System (DDDS, RFC 3401-3).

“Ontology Driven Application Development.”

Example applications:

Eduroam (http://www.eduroam.org) : allows you to travel between universities throughout Europe and use your local credentials to authenticate to the wireless network.

Bologna Process: supporting the movement of students between universities.  Any student should be able to go to another university and take a class then come back.  Has admissions control and grade reporting.

What OM2 does:  Transports the information to the correct address at all times by the use of DDDS, by the transport protocol of the receiver’s choice.

Brian Busby:  ESB at UW-Madison

Talk about our use of the ESB and experience with SOA.

UW-System has been looking at SOA for years (4 or 5 years).  We got to where we were going to buy a commercial SOA suite but we passed on the purchase.  SOA went into hibernation.  Then two projects came along:

  • Course Roster Information Service
  • Course Guide

We decided to take advantage of a license for the Cape Clear ESB.

Interesting impact:  people suddenly had to frame their discussions around the services that they need rather than big data loads or APIs, and they made the change.

Issues:

  • Right-sizing the environment – we don’t know how many people are going to be using the ESB or the load on the services.
  • ESB as a service hosting facility
  • Collaborative development teams (Integration Competency Centers)
  • What aspects of integration should the ESB handle – do you put all the business logic in the ESB, etc
  • Support of the loosely coupled environment

Organization Issues:

  • Governance
  • Ownership of the services, orchestration, operational data stores
  • Security policies
  • Web services granularity
  • Data representation – what XML should we use to represent data
  • Service Level Agreements
  • Service definition & re-use

Getting the ESB in place is finally driving forward the conversations that we were having years ago.

Advanced CAMP – Registering, Discovering and Using Distributed Services Part 1

R.L. Bob doing the introduction: 

Advanced CAMP could mean to some people the advanced topics beyond just the basics.  Bob likes to think of it as the Advance Camp out in the wilderness where you are more likely to get caught in a blizzard, get shot and generally face the wilderness.

The theme that came out was the needs around service discovery in higher education.  Discussions will cover CyberInfrastructure for Humanities, Cloud/Grid, SOA, ESB.    Discussion groups on data models, governance, service discovery and <your topic here>.

Workshop Format:  Each participant should offer (at least):  1 opinion, 1 rant, 1 hope, 1 keen observation.

The problem space:  SOA is happening across academia in a variety of ways, ranging from Web 2.0 apps and mash-ups to messaging.  It happens both within and between institutions.  This impacts how we offer a variety of services and raises a set of questions:

  • How should digital tools and data for scholarship be made available?
  • What metadata should be recorded about them?
  • How can metadata be globally aggregated and searched?
  • What operational and security environments should protect them and enable their appropriate use?
  • How should their semantic relationships be codified and maintained?

Mark comments:  connecting metadata to the object and having it persist and stay attached as the object moves around and is copied is a difficult area to address.

Jill:  SOA is also talked about for traditional administrative systems but do people think about this

Why would academics want to store their content in a central system?  It might be about the ability to add metadata and re-use the content in multiple places.

Loretta Auvil:  SEASR

http://seasr.org/

Goal was to develop a software environment that would allow for the reuse of software components focused on data mining applications for the humanities.  Looking at text analysis and music analysis, doing genre analysis and mood analysis.

The components and descriptions of those components are very web centric based on SOA and Semantic Web.  They are talking about a Semantic Enabled SOA.  The components are written in RDF.

Looking at interesting ways of searching:  Tag Clouds, Link Flows

Working on a workbench using Google Web Toolkit.  Allows you to do a mash-up of the components into flows.

Example Applications:  MONK – it has a custom UI that calls SEASR as a service.  NEMA – a music analysis service that takes 10-second slices of an MP3 and looks at the genre and mood.

Steve Masover:  Project Bamboo

http://projectbamboo.uchicago.edu/
Flickr and del.icio.us tags:  projectbamboo

Asking the question:  How can we advance arts and humanities research through the development of shared technology services?

Areas of focus

Discovery and Analysis
Annotate and manage – including the idea of Folksonomic tagging with identifiable levels of authority.

Need to support serendipitous discovery.  Search is not useful if it limits serendipity and foraging.  Intellectual property pain and accelerating interdisciplinarity are motivating “commons-based peer production” (cf. Yochai Benkler).  There is impatience with copyright.  There is a desire to support inter-scholar relationships.  Community / Networking that supports a “lattice of interest”.  Legal and institutional policy are trending towards advocacy around fair use in law.

Emerging aspects of scholarly practice include: shared standards and services, social and scholarly networks, deep consortia across disciplines and national borders.  There is need for a chain-of-credibility in mash-ups.

Looking less at service/tools development and more at standards-profiling and services to facilitate interoperability.  One area that they might focus on is the sharing / tracking of reference use:  who used a resource in what context and for what purpose, who provided the resources to the commons.

We are moving from a wedding cake stack (data and repository, middleware, application on top) to a three-sided figure with mash-ups and tools on the edge of the triangle.

Ken K – we heard from an English scholar that he does not do “team English.  He is a cat and he does not want to be herded”.

There is a tension between scholars wanting to know “who is using their stuff” but not wanting their own activities monitored.

Daniel Davis:  Fedora Commons

http://www.fedora-commons.org/

Now a 501(c)(3) organization.  Moving from an internal grant-funded project to a community project.

Much of the work is focused on integrating services from other projects rather than re-writing code that already exists.

Splitting into multiple projects: 

  • Fedora Repository – original Fedora Project,
  • Middleware – looking at seamless integration between other groups’ services,
  • Akubra Storage – new storage plug-in architecture, transaction file system,
  • Topaz – core components for semantic-enabled apps currently publishing several journals mostly in medical research,
  • Mulgara Triplestore – highly scalable triplestore.

Relevant technical trends:  SOA, Web2.0, RDF, OWL and OWL-S

There are two paradigms that we are dealing with:  the lightweight Web model with little trust / security and the Enterprise model where you have deep trust / security models (think HR systems).  A repository can bridge these two worlds.  You can easily repose content then add a trust  model and policy driven controls for adding scholarly information on top of the content. 

The Enterprise paradigm needs to support near ACID (atomicity, consistency, isolation, and durability) semantics and a strong security and trust model.

Question:  The idea that there is a difference between Federated Identity and Federated Repositories and how that would work.  They are different aspects but related.  There are discussions about shared information between the repositories like User Accounts.  In one repository, that person might be an account.  In the other, they might be a reference.  How much do you share between the two repositories?

Jens Haeusser:  Kuali Student

http://www.kuali.org/communities/ks/index.shtml

Keys:  Modular, standards-based student system.  Community Sourced rather than open source in that there is a board that sets direction and manages the roadmap.  It is a person-centric system, focused on meeting the needs of the users of the system.  SOA-based.

Traditional ERPs – you tend to implement twice.  Once, when you try to make it meet your current practices and then again when you accept the best practices as defined by the vendor.

Functional Vision:  Support the end users by anticipating their needs.  Support a wide range of learners and learning activities (traditional students but also life-long learners, distance learners, exchange students et al).  Design to make it easier to change business processes.  Reduce time staff spend on routine tasks.

Technical Vision:  SOA and Web Services.  Not delivering an application as much as they are delivering a framework for you to deploy your business processes.  Using the Web Services stack:  Standards-based, adhere to Educational Community License (ECL).  Building the system in Java.  Open Source reference Implementation.

Guiding Principles for the KS Technical Architecture as a PDF

The functional design team is gathering input from a broad range of players from both within an institution as well as between institutions.

The first thing they are working on is Learning Unit Management.  Treating it more like SKUs.  You can compose them together to make larger units.  They have learned that the current way many systems define courses isn’t very good.
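
The "compose them together" idea can be pictured as a simple composite structure, where a unit is either a leaf or a bundle of other units.  This is only a sketch of that idea; the `LearningUnit` class, its fields and the course codes are assumptions and do not describe the Kuali Student data model.

```python
from dataclasses import dataclass, field


@dataclass
class LearningUnit:
    """A unit of learning that may be built from smaller units."""
    code: str
    credits: float = 0.0                       # credits carried by this unit itself
    parts: list = field(default_factory=list)  # smaller units composed into it

    def total_credits(self) -> float:
        """Sum this unit's own credits and those of every nested part."""
        return self.credits + sum(p.total_credits() for p in self.parts)


if __name__ == "__main__":
    lecture = LearningUnit("CHEM-101-LEC", credits=3.0)
    lab = LearningUnit("CHEM-101-LAB", credits=1.0)
    course = LearningUnit("CHEM-101", parts=[lecture, lab])
    first_year = LearningUnit("YEAR-1-SCIENCE", parts=[course])
    print(first_year.total_credits())
```

Because a course and a program are the same kind of object, a non-traditional offering such as a short workshop can be modeled without inventing a new record type.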

Technical Recommendations as a PDF

Database:  Apache Derby
Orchestration:  Apache ServiceMix, Sun OpenESB, Kuali Enterprise Workflow (KEW)

Created a standard development environment that includes a submission environment.  Maven and Subversion, Google Web Toolkit (UI).  Business Rule Management System (BRMS) to store and search for business rules includes a UI for business users to define the rules.  Looking at the Fluid Project for support of accessibility/usability requirements.

They are using different ESBs for different aspects of the framework.

ITANA Face 2 Face – Security Architecture

Indiana University

Completed a 10 year Strategic Plan which worked because they connected money to it.  You couldn’t get funding unless you showed how your project connected to one of the 71 strategic initiatives.  Completed a 10 year tactical Telecom Plan.  Instead of replacing 1/4 of the switches every year for four years, they want to replace all switches in one year so they can take advantage of new features.

802.1X access solution based on MAC addresses or logins.  Getting to automated, policy-based network access.  What is the value of this and what have people done in this area?  What are the policy zones?  This can flip it over so that we are both protecting our network from devices as well as protecting devices from our network.

This group could develop some design templates that schools could use in discussions with vendors.

UW-Madison

Should there even be a Security Architecture?  Shouldn’t security be embedded in all of the groups and users?  When Stefan started in 2001, he was always asked “Why?” about security items.  Why do I need to use a firewall?  Why should I have logging turned on?  He established a set of principles:

  • Security is Everyone’s Responsibility
  • Security is Part of the Development Life Cycle
  • Security is Asset Management (classifying the information)
  • Security is a Common Understanding

We have a five-step process for doing a risk assessment.  First we agree to the assessment scope, then conduct the assessment, develop a draft report, communicate the findings, and then re-assess as needed.

Risk = (Impact X Likelihood) / (Mitigation Controls)

Impact is related to costs.  How do you monetize reputation?  You can ask how much you would spend to prevent this from happening.  This is a Risk Prioritization process.

How do you balance the security principles against the development principles (scalability et al)?
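
To make the prioritization concrete, here is a small Python sketch that applies the formula above to a few findings and ranks them. The 1-to-5 scales, the finding names and the scores are assumptions made up for illustration; the talk did not specify how each factor is scored.

```python
def risk_score(impact: float, likelihood: float, mitigation: float) -> float:
    """Risk = (Impact x Likelihood) / (Mitigation Controls), as given above."""
    if mitigation <= 0:
        # Assumed convention: 1 means "no controls", so zero or negative is invalid.
        raise ValueError("mitigation must be positive")
    return (impact * likelihood) / mitigation


# Hypothetical findings scored on an assumed 1-5 scale for each factor.
findings = {
    "unpatched web server": risk_score(impact=5, likelihood=4, mitigation=2),
    "lost laptop": risk_score(impact=4, likelihood=3, mitigation=4),
    "shared admin password": risk_score(impact=5, likelihood=3, mitigation=1),
}

# Highest score first: this ordering is the prioritization.
ranked = sorted(findings, key=findings.get, reverse=True)
```

With these made-up numbers the shared admin password ranks first (15.0), then the unpatched server (10.0), then the lost laptop (3.0). Stronger mitigation controls push a finding down the list even when its impact is high.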


ITANA Face 2 Face: Data Management

Data Management  Discussion:

Key Issues:

  • Data Architecture, Analysis and Design
  • Data Security Management  – data access and security
  • Reference and Master Data Management  – making data available rather than copying data
  • Data Warehousing and Business Intelligence Management – normalizing the data across the data warehouse
  • Document, Record and Content Management –
  • Meta Data Management –

The difference between Structured Data (data in authoritative systems, usually in a database) and Unstructured Data (  ).  The Structured Data was designed by a DBA.  These systems can proliferate silos.  Complex queries are difficult to build and brittle.  The metadata and taxonomy as delivered is often “accepted” without thought as the enterprise definition and taxonomy.  They also include open fields to store whatever you want.

Unstructured data is individually generated, often in file systems, often without much metadata that is meaningful to enterprise.  The rich media formats cannot be easily mined to discover content.  Management is a nightmare with a proliferation of stores and types of content.

Structured Data Gaps:

Data Warehouses:  they were sold as a way to build a bridge across the silos.  The queries are difficult to construct and often take a lot of effort to get written.  It is hard to deliver the complex queries.  All the business logic used to develop the data and queries is missing.  There is a gap between the definitions and the data in the warehouse.  You can define student 12 ways so any query could have 12 answers.

There is no business rules repository that lets you figure out how things are defined.  You can build business rules into the database and into the application code.  The farther you get from source, the farther you get from the business rules and the definition and intent for the data.
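
A toy Python sketch of the problem and of what a rules repository would offer: each definition of “student” is stored under a name, so a query has to say which one it means. The definition names, fields and records are hypothetical.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]

# Hypothetical repository: each business definition is named and kept in one place.
DEFINITIONS: Dict[str, Callable[[Record], bool]] = {
    "student.enrolled_this_term": lambda r: r["credits_this_term"] > 0,
    "student.degree_seeking": lambda r: r["program"] is not None,
}


def count(records: List[Record], definition_name: str) -> int:
    # The caller must name the definition, so the answer is never ambiguous.
    rule = DEFINITIONS[definition_name]
    return sum(1 for r in records if rule(r))


# Made-up records for illustration.
people = [
    {"name": "A", "credits_this_term": 12, "program": "BS Biology"},
    {"name": "B", "credits_this_term": 3, "program": None},
    {"name": "C", "credits_this_term": 0, "program": "MA History"},
    {"name": "D", "credits_this_term": 6, "program": None},
]

enrolled = count(people, "student.enrolled_this_term")
degree_seeking = count(people, "student.degree_seeking")
```

The same four records give 3 “students” under one definition and 2 under the other. Without a shared repository, each report writer picks a definition silently and the numbers disagree.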

Data Warehouse is used to buffer the source system from queries.

When we give out reporting tools to individuals in offices, it locks you into the schemas in the data warehouse.  As people develop their queries, it locks down the database table structure.  If you change the schema to make more enterprise sense, then many distributed queries suddenly break.  There are also “experts” who have a vested interest in the complexity of the data warehouse.  When you streamline and change the process and the queries, you actually threaten the experts.

LDAP as an example:  We bring data from a bunch of sources, we then normalize the data and present it in standard queries for consumption at large.
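
The pattern in the LDAP example can be sketched in Python as below: records from two source systems are normalized into one shape and merged per person. The source field names (EMPLID, student_id and so on) and the output attributes are assumptions for illustration, not an actual directory schema.

```python
from typing import Dict, List

Entry = Dict[str, str]


def from_hr(rec: Dict[str, str]) -> Entry:
    # Hypothetical HR feed: separate FIRST/LAST fields, inconsistent casing.
    first = rec["FIRST"].strip().title()
    last = rec["LAST"].strip().title()
    return {"uid": rec["EMPLID"].strip().lower(),
            "display_name": f"{first} {last}",
            "affiliation": "staff"}


def from_student_system(rec: Dict[str, str]) -> Entry:
    # Hypothetical student feed: a single "last, first" name field.
    last, first = [part.strip().title() for part in rec["name"].split(",", 1)]
    return {"uid": rec["student_id"].strip().lower(),
            "display_name": f"{first} {last}",
            "affiliation": "student"}


def build_directory(hr_rows: List[Dict[str, str]],
                    student_rows: List[Dict[str, str]]) -> Dict[str, dict]:
    # Merge normalized entries by uid so consumers see one record per person.
    directory: Dict[str, dict] = {}
    entries = [from_hr(r) for r in hr_rows]
    entries += [from_student_system(r) for r in student_rows]
    for e in entries:
        merged = directory.setdefault(
            e["uid"],
            {"uid": e["uid"], "display_name": e["display_name"], "affiliations": []},
        )
        if e["affiliation"] not in merged["affiliations"]:
            merged["affiliations"].append(e["affiliation"])
    return directory


directory = build_directory(
    [{"EMPLID": " JDOE ", "FIRST": "jane", "LAST": "DOE"}],
    [{"student_id": "jdoe", "name": "doe, jane"}],
)
```

Consumers query the merged directory and never see the quirks of either source system.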

A place to start:  things that go into an executive dashboard.

Access To Data project that turned into a drive to get large data sets into Excel on the desktop so they could drill around on their own.

Privilege Management: Authorization in application based on name NOT on an institution role.

At UW-Madison, we manage privileges by sneaker-net.  We don’t have access to metadata so that we can generate privileges based on roles.  We don’t have a way to delete someone from all of the systems when they leave or change roles.  The roles of people have states that we have to move them through.

There are multiple organization charts that come into play when you try to define the role(s) of a person, and those roles can actually differ from the application roles.  Every application also has roles defined and applications do RBAC.  But there needs to be an external system where you manage these people and roles.  There are two views:  one is that there have to be application-centric views of roles and privileges, the second is that there could be a set of pre-defined roles that come with a suite of privileges.
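
The second view, pre-defined roles that each carry a suite of privileges and are managed outside any one application, could look roughly like this in Python. The role names, privilege strings and class are hypothetical; this is a sketch of the idea, not a description of any system discussed.

```python
from typing import Dict, Set

# Hypothetical pre-defined roles, each with a suite of privileges across applications.
ROLE_PRIVILEGES: Dict[str, Set[str]] = {
    "advisor": {"sis:read_transcript", "sis:edit_plan"},
    "payroll_clerk": {"hr:read_salary", "hr:edit_timesheet"},
}


class RoleRegistry:
    """External store of person-to-role assignments, shared by all applications."""

    def __init__(self, role_privileges: Dict[str, Set[str]]) -> None:
        self._role_privileges = role_privileges
        self._assignments: Dict[str, Set[str]] = {}

    def assign(self, person: str, role: str) -> None:
        if role not in self._role_privileges:
            raise KeyError(f"unknown role: {role}")
        self._assignments.setdefault(person, set()).add(role)

    def revoke_all(self, person: str) -> None:
        # One call removes the person everywhere, e.g. when they leave.
        self._assignments.pop(person, None)

    def privileges(self, person: str) -> Set[str]:
        # Privileges are derived from roles, never granted by name.
        granted: Set[str] = set()
        for role in self._assignments.get(person, set()):
            granted |= self._role_privileges[role]
        return granted

    def can(self, person: str, privilege: str) -> bool:
        return privilege in self.privileges(person)


registry = RoleRegistry(ROLE_PRIVILEGES)
registry.assign("jdoe", "advisor")
```

Because applications ask the registry instead of keeping their own lists of names, a role change or a departure is handled in one place rather than by sneaker-net.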

There is a set of RULES which are different from the roles.  The rules must be stored in a repository as well.

Unstructured Data Gaps:

Electronically recorded lectures, talks etc: We gather some metadata when we create the file like it is the third lecture, created on this date, etc.  We cannot scan these files to get rich metadata.

Unstructured Data Management Architecture from IBM.  It is cycle-intensive.  It looks at 10-second clips of music and adds metadata (like it is “happy music”).  The idea that you can just grind at the problem with power might work for a while.  There are vendor(s) who are working in this space.

Just knowing what data exists is an important step.  Storage is just as important.  How long do you archive the data?  At what level of storage should you store it?  The librarians are building dark archives.  They are storing data in hopes that some day we will be able to “do something with it”.  The metadata harvesting and management tools are immature.

Digital Signatures:  When we throw stuff out onto the web or into distributed storage, how do we mark the content so we can mine the archives?  “If there was a point to doing it, people might do it.”  Not many people see the value in deploying the systems.

On Wikipedia, authors who aren’t professors claim to be so that their stuff will be taken more seriously.  The ability to express our university membership out in the world at large becomes more important.

Students will be coming to us with digital identities.  They will want to use those identities and we will become another fob on their keychain that they use in the world at large.  We may not be the source of their identities in the future.

All of the data is going to live someplace.  We will not be holding it all but we will need to be able to assert our IP over the data wherever it lives.  Look at the RIAA and their ability to enforce their IP across multiple platforms.

Standardized media formats:  

E-discovery:   When you have an E-Discovery request, it is no longer personal data or institutional data.  What is the impact of distributed storage and the Web2.0 applications on e-discovery requests?  Where is the liability?  Who will be sued?  Don’t change data management practices because of e-discovery.
